Restore justice in Turkey


Hundreds of academics and scientists are among those caught up in political crackdowns in 
Turkey. The government should end the state of emergency. 




Peace is a dangerous cause to fight for in Turkey right now. In the latest blow to academics, 11 members of the Turkish Medical Association, including its president, Raşit Tükel, were arrested in early-morning raids last week. Their crime? Using the slogan that war is a matter of public health, the association had called for a halt to the Turkish army’s cross-border assault on military units of Syrian Kurds, launched on 20 January to international consternation. (The Kurdish units targeted have been fighting alongside US troops against the Islamist terrorist organization ISIS in northwest Syria.) The raids follow the arrest of more than a thousand academics who signed a petition in January 2016 calling for peace in the country’s southeast, where government forces were fighting Kurdish separatists. Many face criminal charges, and hundreds lost their jobs.

University professors and scientists were also among the 150,000 public servants who were detained and dismissed when draconian laws and a state of emergency were imposed after a failed coup in July 2016. Now, a report published on 18 January (see go.nature.com/2el9qze) by human-rights organizations in Turkey shows that many of those dismissed stand accused of supporting the FETÖ, or Gülen, organization believed by the government to have been behind the coup attempt. Membership in other terrorist organizations is also alleged and, as a result, many of these academics face serious terrorism-related charges.

The report details, as far as it is possible to do so, the arrests, detentions and trials of those caught up in the post-coup purges, and raises concern that miscarriages of justice might be occurring on a large scale. Universities have been hard hit — the report says that 5,822 professors and researchers lost their jobs, 380 of whom were signatories of the 2016 Academics for Peace petition. More than 21,000 health-care professionals were among those fired from public service, and a further 4,113 were judges and prosecutors; their loss partly explains why trials are moving forward so slowly. The report notes that even having downloaded a particular encrypted smartphone text-messaging system (called ByLock), favoured by Gülenists and available only through personal introduction, was enough to condemn someone.

The plight of Turkey’s academics must not be forgotten. They must be allowed fair hearings and trials without further delay. Telling their stories can be powerful, too. Last week, Nature published an interview with theoretical physicist Ali Kaya at Boğaziçi University in Istanbul, who has been charged with being a member of a terrorist organization, about how he managed to carry on his research during his 15 months of incarceration. Colleagues in other countries had tweeted about his achievement — a tactic that other scientists might adopt to help their colleagues in Turkey avoid falling out of the public consciousness.

The general situation in Turkey — whose president is becoming increasingly authoritarian and bellicose, and which hovers on outright civil war — endangers the serious efforts the country has recently been making to improve its research base. As one part of the government oversees mass arrests and orchestrates war, other parts are quietly but
determinedly working to fix some of the entrenched problems in the 
research system. Thousands of new PhD places have been created in 
recent years, along with some brand-new research institutes, and universities have been energized into competing with each other by offering financial rewards for strong performers. It is a start, and has been enough to persuade at least some young scientists doing postdocs abroad to return home to establish independent research labs.

This is a source of hope in more ways than one. Science can provide a channel for maintaining contact and discussion between countries and cultures in politically tense times. Inevitably, however, like other professionals, many of the scientists making successful careers in Turkey have half-formed emigration plans in mind. The Turkish government needs to make its scientists feel safe. It should revoke the newly extended state of emergency, which has long since outlived its legitimate purpose.


Hardware upgrade 


Artificial intelligence is driving the next wave 
of semiconductor innovations. 


Advances in computing tend to focus on software: the flashy apps and programs that can track the health of people and ecosystems, analyse big data and beat human champions at Go. Meanwhile, efforts to introduce sweeping changes to the hardware that underlies all that innovation have gone relatively unnoticed.

Since the start of the year, the Semiconductor Research Corporation (SRC) — a consortium of companies, academia and government agencies that helps to shape the future of semiconductors — has announced six new university centres. Having watched the software giant Google expand into hardware research on artificial intelligence (AI), the main chip manufacturers are moving to reclaim the territory. As they do so, they are eyeing the start of a significant transformation — arguably the first major shift in architectures since the birth of computing.

This would be important to science: researchers in fields from astronomy and particle physics to neuroscience, genomics and drug discovery would like to use AI to analyse and find trends in huge sets of data. But this places new demands on traditional computer hardware. The conventional von Neumann architecture keeps data-storage units inside computers separate from data-processing units. Shuttling information back and forth between them takes time and power, and creates a bottleneck in performance.

To take advantage of AI technology, hardware engineers are looking 
to build computers that go beyond the constraints of von Neumann 
design. This would be a big step forward. For decades, advances in com- 
puting have been driven by scaling down the size of the components, 
guided by Gordon Moore’s prediction that the number of transistors on 
a chip roughly doubles every two years — which generally meant that 
processing power did the same. 
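
As a rough illustration of the doubling rule described above, the short Python sketch below projects a transistor count forward in time. The function name, the starting count and the perfectly regular two-year doubling period are illustrative assumptions, not industry figures; real chips have followed the trend only approximately.

```python
def projected_transistors(initial_count, years, doubling_period_years=2.0):
    """Project a transistor count under an idealized doubling rule.

    Assumes (hypothetically) that the count doubles exactly once every
    `doubling_period_years`, which real hardware only approximates.
    """
    return initial_count * 2 ** (years / doubling_period_years)


# Hypothetical starting point: a chip with 1,000 transistors.
# Ten years at one doubling every two years is five doublings.
after_ten_years = projected_transistors(1_000, 10)
print(after_ten_years)
```

Five doublings multiply the starting count by 32, which is why even a modest doubling period compounds so quickly over decades.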

Modern computers bear little resemblance to early machines that 
used punch cards to store information and mechanical relays to perform 
calculations. Integrated circuits now contain transistors so small that 
more than 100 million of them would fit on the head of a pin. Yet the 
fundamental design of separate memory and processing remains, and 
that places a limit on what can be achieved. 

One solution could be to merge the memory and processing units, 
but performing computational tasks within a memory unit is a major 
technical challenge. 

Google’s AlphaGo research shows a possible, different way forward.
The company has produced new hardware called a tensor processing 
unit, with an architecture that enables many more operations to be 
performed simultaneously. This approach to parallel processing sig- 
nificantly increases the speed and energy efficiency of computationally 
intensive calculations. And designs that relax the strict need to per- 
form exact and error-free computation — a change in strategy known 
as approximate computing — could increase these benefits further. 
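
One simple way to picture approximate computing is reduced numerical precision: values are rounded to a coarse grid before the arithmetic is done, trading a small error for cheaper operations. The Python sketch below is only an illustration of that idea under made-up numbers. It is not a description of how the tensor processing unit works, and the grid spacing, helper names and sample values are all assumptions.

```python
def quantize(values, step):
    """Round each value to the nearest multiple of `step` (a coarser grid)."""
    return [round(v / step) * step for v in values]


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical weights and inputs, chosen only for illustration.
weights = [0.12, -0.57, 0.33, 0.91]
inputs = [1.5, 2.0, -0.75, 0.4]

exact = dot(weights, inputs)

# Same calculation after rounding everything to multiples of 1/32.
step = 1 / 32
approx = dot(quantize(weights, step), quantize(inputs, step))

relative_error = abs(approx - exact) / abs(exact)
print(exact, approx, relative_error)
```

With these sample numbers the rounded calculation stays within a few per cent of the exact one, which is the kind of tolerance that approximate designs aim to exploit.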

As a result, the power consumption of AI programs such as AlphaGo
has improved dramatically. But increasing the energy efficiency of such 
hardware is essential for AI to become widely accessible. 

The human brain is the most energy-efficient processor around, so
it is natural for hardware developers to try to mimic it. An approach
called neuromorphic computing aims to do just that, with technolo- 
gies that seek to simulate communication and processing in a bio- 
logical nervous system. Several neuromorphic systems have already 
demonstrated the ability to emulate collections of neurons on tasks 
such as pattern recognition. 

These are baby steps, and now the SRC has stepped in to try to encourage the hardware to walk. Under its Joint University Microelectronics Program, the SRC has quietly placed its focus on developing hardware architecture. A new centre at Purdue University in West Lafayette, Indiana, for example, will research neuromorphic computing, and one at the University of Virginia in Charlottesville will develop ways of harnessing computer memory for extra processing power.

This technological task is huge. So it is heartening to see the SRC, traditionally US-centric, opening its doors. South Korean firm Samsung joined in late 2017, the fifth foreign company to sign up in the past two years. This is a welcome sign of collaboration. But that commercial rivals would work together in this way also signals how technically difficult the industry thinks it will be to develop new hardware systems.

As this research develops, Nature looks forward to covering progress and publishing results. We welcome papers that will enable computing architectures beyond von Neumann, such as components for neuromorphic chips and in-memory processing. Scientists across many fields are waiting for the result: computers powerful enough to sift all of their new-found data. They will have to wait a while yet. But the wait should be worth it.


Maths revision 


A decadal update of academic mathematics 
shows the value of taking one’s time. 


Mathematics has its own way of doing things. Not for
mathematicians the breakneck chase after the latest academic
fad. “It goes up and down over the centuries,” said one expert,
when asked whether fluid dynamics — her focus — is now trendy. 

Maths moves at its own pace, and the field is currently involved in 
a global effort to analyse, audit and agree new classifications of how 
mathematicians study and make use of maths. The MSC2020 system, 
due to appear in 2020, will formally approve new categories of maths, 
and split existing definitions into finer classes. 

MSC stands for Mathematics Subject Classification, and it provides 
taxonomical order. In the current MSC2010, for instance, the code 03 
represents mathematical logic and foundations. Going deeper, 03E is 
set theory and 03E72 is fuzzy set theory. 
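
The nesting of these codes can be sketched in a few lines of Python. The example below assumes a simplified code shape (two digits, one letter, two digits, as in 03E72) and uses only the three labels quoted above; the function names and the small lookup table are hypothetical, and real MSC codes include other forms that this sketch does not handle.

```python
# Labels taken from the three MSC2010 examples quoted in the text.
LABELS = {
    "03": "mathematical logic and foundations",
    "03E": "set theory",
    "03E72": "fuzzy set theory",
}


def msc_levels(code):
    """Split a code such as '03E72' into its three nested levels.

    Simplifying assumption: the code is exactly two digits, one letter
    and two digits. Other code shapes are rejected.
    """
    code = code.strip().upper()
    well_formed = (
        len(code) == 5
        and code[:2].isdigit()
        and code[2].isalpha()
        and code[3:].isdigit()
    )
    if not well_formed:
        raise ValueError(f"unsupported code shape: {code!r}")
    return [code[:2], code[:3], code]


def describe(code):
    """Pair each level of a code with its label, if the label is known here."""
    return [(level, LABELS.get(level, "(label not listed)")) for level in msc_levels(code)]


for level, label in describe("03E72"):
    print(level, "-", label)
```

Reading a code from left to right therefore moves from the broad area to the specific topic, which is what lets the same code serve both for searching and for routing papers to reviewers.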

Why bother? The system is jointly managed by the mathematical 
resource zbMATH, curated by the Leibniz Institute for Information
Infrastructure in Karlsruhe, Germany, and by the American Mathematical
Society’s Mathematical Reviews. Each is a ‘meta-journal’ that
systematically summarizes and reviews every paper that comes out in 
the peer-reviewed mathematical literature. Mathematical Reviews and 
zbMATH use the MSC in their internal workflows, and many other 
journals have adopted the system to assign submissions to editors and 
reviewers. Mathematicians also use the numerical codes to search for 
papers in their speciality. 

To keep the system up to date, every ten years the two organizations 
consult reviewers and request suggestions for new entries from the 
broader community. Nominations opened in July 2016 and close this
August. A theme emerging for proposed new categories is for fields that
mix traditional disciplines — such as ‘algebraic statistics’ and ‘numerical
algebraic geometry’.

Take topological data analysis, a popular candidate for inclusion. The 
theory has its roots in topology — the study of shapes and their arrange- 
ments within one another — which includes knot theory and higher- 
dimensional spaces. For more than a century, topology was mostly a 
pure-maths affair. But researchers have found ways to use it to give struc- 
ture to large data sets, and so topological data analysis has been born. 

More generally, the revision takes the pulse of broader cultural shifts. 
Suggested new categories indicate that more mathematicians have 
started to collaborate with researchers in other fields. 

Recognition of a new subfield can depend on building citations, and
that is a slow process in maths. A recent study of some 20 million
references for more than 900,000 mathematical articles in zbMATH found
that the time it takes for a paper's citations to peak is several years longer 
than in other fields — and is lengthening. Consequently, it takes a while 
for even the most dramatic breakthroughs to register in the MSC sys- 
tem. Many mathematicians expect Peter Scholze, a number theorist at 
the University of Bonn in Germany, to win a Fields Medal this year for 
his pioneering work on perfectoid spaces. But, as a research category, 
perfectoid spaces — only around since 2010 or so — is probably too 
undercooked yet to make the cut for MSC2020. 

Can such a rigid hierarchy survive in an age of fluid metadata and 
keyword tagging? For now, it remains relevant. Studies have found 
a high correlation between clustering of the mathematical literature 
into topics — as measured from citation networks — and the MSC, at 
least at its upper levels. But things might change. For its own journals, 
for example, the American Physical Society changed in 2016 from a 
system similar to the MSC to a hybrid one called Physics Subject Head- 
ings. This has both a hierarchical tree of subfields and a broader set of 
‘facets’ that cut across them like a Venn diagram, encompassing many 
terms. Maths might do the same at some point — but, quite correctly, 
in its own time. Maths has no need to start following fashion now.
CORRECTION

The Editorial ‘Maths revision’ (Nature 
554, 146; 2018) mistranslated the name of 
the Leibniz Institute. It is actually the Leibniz 
Institute for Information Infrastructure. 
Words were a good start — now it is time for action

Five years ago, the Declaration on Research Assessment was a rallying point. It must now become a tool for fair evaluation, urges Stephen Curry.

Declarations are bound to fall short. The 240-year-old United States Declaration of Independence holds it self-evident that “all men [sic] are created equal”, but equality remains a far-off dream for many Americans.

The San Francisco Declaration on Research Assessment (DORA; https://sfdora.org) is much younger, but similarly idealistic. Conceived by a group of journal editors and publishers at a meeting of the American Society for Cell Biology (ASCB) in December 2012, it proclaims a pressing need to improve how scientific research is evaluated, and asks scientists, funders, institutions and publishers to forswear using journal impact factors (JIFs) to judge individual researchers.

DORA’s aim is a world in which the content of a research paper matters more than the impact factor of the journal in which it appears. Thousands of individuals and hundreds of research organizations now agree and have signed up. Momentum is building, particularly in the United Kingdom, where the number of university signatories has trebled in the past two years. This week, all seven UK research councils announced their support.

Impact factors were never meant to be a metric for individual papers, let alone individual people. They’re an average of the skewed distribution of citations accumulated by papers in a given journal over two years. Not only do these averages hide huge variations between papers in the same journal, but citations are imperfect measures of quality and influence. High-impact-factor journals may publish a lot of top-notch science, but we should not outsource evaluation of individual researchers and their outputs to seductive journal metrics.

Most agree that yoking career rewards to JIFs is distorting science. Yet the practice seems impossible to root out. In China, for example, many universities pay impact-factor-related bonuses, inspired by unwritten norms of the West. Scientists in parts of Eastern Europe cling to impact factors as a crude bulwark against cronyism. More worryingly, processes for JIF-free assessment have yet to gain credibility even at some institutions that have signed DORA. Stories percolate of research managers demanding high impact factors. Job and grant applicants feel that they can’t compete unless they publish in prominent journals. All are fearful of shrugging off the familiar harness.

So, DORA’s job now is to accelerate the change it called for. I feel the need for change whenever I meet postdocs. Their curiosity about the world and determination to improve it burn bright. But their desires to pursue the most fascinating and most impactful questions are subverted by our systems of evaluation. As they apply for their first permanent positions, they are already calculating how to manoeuvre within the JIF-dependent managerialism of modern science.

There have been many calls for something better, including the Leiden Manifesto and the UK report ‘The Metric Tide’, both released in 2015. Like DORA, these have changed the tenor of discussions around researcher assessment and paved the way for change.

It is time to shift from making declarations to finding solutions. 
With the support of the ASCB, Cancer Research UK, the European 
Molecular Biology Organization, the biomedical funder the Wellcome 
Trust and the publishers the Company of Biologists, eLife, F1000, 
Hindawi and PLOS, DORA has hired a full-time community manager 
and revamped its steering committee, which I head. We are committed 
to getting on with the job. 

Our goal is to discover and disseminate examples of good practice, 
and to boost the profile of assessment reform. We will do that at con- 
ferences and in online discussions; we will also establish regional 
nodes across the world, run by volunteers who will work to identify 
and address local issues. 

This week, for example, DORA is participating
in a workshop at which the Forum for Responsible
Metrics — an expert group established following
the release of ‘The Metric Tide’ — will present
results of the first UK-wide survey of research 
assessment. This will bring broader exposure to 
what universities are thinking and doing, and put 
the spotlight on instances of good and bad practice. 

We have to get beyond complaining, to find 
robust, efficient and bias-free assessment meth- 
ods. Right now, there are few compelling options. 
I favour concise one- or two-page ‘bio-sketches’,
similar to those rolled out in 2016 by the Univer- 
sity Medical Centre Utrecht in the Netherlands. 
These let researchers summarize their most 
important research contributions, plus mentoring, societal engagement 
and other valuable activities. This approach could have flaws. Perhaps 
it gives too much leeway for ‘spin’. But, as scientists, surely we can agree
that it’s worth doing the experiment to properly evaluate evaluation. 

This is hard stuff: we need frank discussions that grind through 
details, with researchers themselves, to find out what works and to 
forestall problems. We need to be mindful of the damage wrought 
to the careers of women and minorities by bias in peer review and in 
subjective evaluations. And we need to join in with parallel moves 
towards open research, data and code sharing, and the proper rec- 
ognition of scientific reproducibility. 

Declarations such as DORA are important; credible alternatives to 
the status quo are more so. True success will mean every institution, 
everywhere in the world, bragging about the quality of their research-
assessment procedures, rather than the size of their impact factors.


Stephen Curry is a professor of structural biology and assistant 
provost for equality, diversity and inclusion at Imperial College 
London. He is also chair of the DORA steering group. 

e-mail: s.curry@imperial.ac.uk 
POLICY


Ivory trade banned 


Hong Kong lawmakers voted on 31 January to ban the trading of ivory in the Chinese territory, which is the world’s largest ivory market. The ban will be implemented in phases, and will ultimately require traders to dispose of their stock by 2021. Conservationists hailed the move as a victory for elephant preservation. Although ivory sales are banned in most of the world under a 1990 treaty, sales of antiques made of the material have remained legal in Hong Kong, providing cover for illegal trade in fresh ivory. A ban on the substance in mainland China — the biggest market for Hong Kong ivory — came into effect on 31 December last year.


Truck tracks harm ancient Peruvian site

Peru’s ancient Nazca Lines have been damaged by a truck that drove over the cultural site without permission, the country’s ministry of culture said on 29 January. The incident, which occurred on 27 January, left deep tracks over an area roughly 50 metres by 100 metres and affected three geoglyphs — images scratched into the ground. A truck driver was detained later that day, but subsequently released by a judge; prosecutors have appealed against the ruling. The Nazca Lines were constructed between 500 BC and AD 500, and are thought to have been used in astronomy. They have been designated a World Heritage Site by the United Nations.


PEOPLE


Investigator death

Esmond Bradley Martin, a veteran investigator in the fight against ivory poaching in Africa, was found dead at his home in Nairobi on 4 February. Media reports said that the 76-year-old had been stabbed in the neck. Kenyan police have arrested four people in connection with the death, although the motive for the crime remains unclear. Bradley Martin, a US citizen, had spent decades investigating the trade in elephant tusks and rhino horn across Africa and Asia. His work influenced China’s decision to end its legal rhino-horn trade in 1993, as well as the country’s ban on ivory sales, which came into force on 31 December 2017. China’s ivory trade has been widely blamed for driving elephant poaching; in recent years, Bradley Martin showed that illicit trade was moving from China to neighbouring countries, such as Laos and Vietnam.


Astronomy move

Astrophysicist Christian Ott, who in 2015 was found by the California Institute of Technology in Pasadena to have committed gender-based harassment against two graduate students there, is moving to the University of Turku in Finland. “Dr Ott’s past was known during the recruitment process, and the matter has been carefully considered,” the University of Turku said in a 1 February statement, noting that Ott will not have supervisory responsibilities. The position is for 2 years, with a 4-month trial period. His appointment has drawn criticism from other astronomers, who say they are concerned about the message the hiring sends to those who have been harassed. Ott was a visiting professor at Kyoto University from March to June 2017, before leaving Caltech last December. Through his lawyer, Ott declined Nature’s request to comment on the matter.


Health chief resigns

The head of the US Centers for Disease Control and Prevention (CDC) resigned on 31 January, soon after coming under fire in the press for trading stock in tobacco companies while leading the agency. Brenda Fitzgerald had served as CDC director since last July. On 30 January, Politico broke a story about her tobacco holdings. A spokesman for the US Department of Health and Human Services said in a statement that Fitzgerald’s financial holdings required “a broad recusal” that would limit her ability to perform her job.


FACILITIES 


Neutrino test axed 
Italy’s physics agency, the INFN, has cancelled a planned neutrino experiment. The Short distance neutrino Oscillations with BoreXino (SOX) experiment, a collaboration with the CEA, France’s nuclear agency, was designed to determine whether there is a fourth, ‘sterile’ type of neutrino in addition to the three known ones. The partnership had planned to install a high-intensity neutrino source at the Gran Sasso underground physics laboratories in central Italy, which is home to the Borexino neutrino detector. But Mayak, a Russian firm contracted to make the source, has said that it will be unable to extract enough cerium-144 from nuclear waste for the 18-month experiment. SOX spokesperson Marco Pallavicini says that the agencies had spent about one-third of the estimated €6-million (US$7.5-million) cost of the experiment.


Wolf lawsuit

Conservation groups filed a lawsuit on 30 January to compel the US Fish and Wildlife Service to strengthen its plan to rescue the endangered Mexican wolf (Canis lupus baileyi) from extinction. The animal is a rare subspecies of grey wolf; an estimated 150 individuals roam New Mexico, Arizona and northern Mexico. The lawsuit alleges that the federal agency failed to take steps to help the canines recover. Leading wolf biologists have recommended expanding the animal’s range and establishing new populations in the southwestern United States. The groups also claim that the plan inadequately addresses high levels of inbreeding in the Mexican-wolf population, a significant threat to its future existence.


Satellite launch

Italy and China have launched a satellite that will monitor electromagnetic phenomena from space that may be linked to earthquakes and other seismic activity on Earth. On 2 February, a rocket carrying Zhangheng-1, also called the China Seismo-Electromagnetic Satellite, lifted off from the Jiuquan Satellite Launch Centre in the Gobi Desert in Inner Mongolia. The satellite is equipped with nine scientific instruments. Data they collect will be used to develop new methods of studying earthquakes from space, say researchers involved in the project. Zhangheng-1 is expected to be in orbit for five years.


Fertility licences

Two women have been approved to be the first in the United Kingdom to receive an in vitro fertilization procedure called mitochondrial replacement therapy (MRT), which uses the DNA of three people. MRT reduces the risk of women passing on certain inherited diseases caused by mutations in mitochondrial DNA. The United Kingdom legalized the procedure in 2015 after a parliamentary vote, but people wishing to undergo treatment must be approved individually by the country’s fertility-treatment regulator, the Human Fertilisation and Embryology Authority. The procedures will be carried out at Newcastle Fertility Centre. MRT has already been successfully performed in Mexico and Ukraine.


TREND WATCH

Japan will increase its spending on science and technology by 7% to ¥3.84 trillion (US$35 billion) in 2018 compared with the previous year, the government’s science advisory body announced on 30 January. The rise comes after stagnant growth in the science budget since the early 2000s. Prime Minister Shinzo Abe’s government aims to boost Japan’s science and technology budget by ¥300 billion per year to meet a goal of spending 1% of gross domestic product on research by 2020, up from 0.65% in 2015.

JAPAN’S SCIENCE SPENDING

This year, the government adopted a standardized method for counting science spending by individual ministries to follow the practices of the Organisation for Economic Co-operation and Development.

[Chart: science and technology spending across all government ministries (¥, trillion). The new method added an extra ¥80.3 billion to the science budget bottom line in 2018, compared with the old method.]

Source: Cabinet Office of Japan


FUNDING


Funding boost

India will increase its investment in science by 10% to 536.2 billion rupees (US$8.4 billion) for 2018-19, compared with the previous year. The budget, released on 1 February, includes 30.7 billion rupees earmarked for a digital programme that includes artificial intelligence and cyber systems. Despite the new money, the country’s spending on science will remain relatively low, at around 0.8% of gross domestic product (GDP). Scientists have called for the government to boost investment in science, technology and research to 3% of GDP.


AI start-up fund

Leading machine-learning researcher Andrew Ng has launched a US$175-million fund dedicated to nurturing artificial intelligence (AI) start-up companies. Ng, who was previously head of AI at Chinese tech giant Baidu, has raised money from some of Silicon Valley’s biggest investors. He wrote in a blog post on 30 January that the effort, called the AI Fund, will aim to build companies from scratch and allow quickly developing AI firms to focus on research rather than on fundraising. In December, Ng launched Landing.AI, a start-up aimed at bringing AI to the manufacturing industry — for example, by developing automated visual-inspection systems to spot defects in products.


NEWS IN FOCUS




Earth’s first flower 


Some scientists doubt a statistical prediction of the ancestral blossom’s structure. 


BY HEIDI LEDFORD 


An ambitious effort to reconstruct the world’s first flower has seeded a debate over what that blossom looked like — and, more broadly, which forms a flower can take.

The project, called eFLOWER, combined
an unparalleled database of plant traits, 
reams of molecular data on evolutionary 
relationships, and complex statistical models 
to determine what the ancestor of all mod- 
ern flowering plants might have looked like. 




Sunflowers and all other flowering plants probably arose from a common ancestor. 


When the study’s results were published last 
August (H. Sauquet et al. Nature Commun. 8, 
16047; 2017), they drew intense interest from 
academics and the media. 

But since then, researchers have raised 
questions about some of eFLOWER’s pre- 
dictions. On 31 January, plant morphologist 
Dmitry Sokoloff at Moscow State University 
and his colleagues published a reanalysis of 
the data that suggests a different arrangement 
of key female reproductive structures in the 
first flower (D. Sokoloff et al. Am. J. Bot. http://dx.doi.org/10.1002/ajb2.1003;
2018).

The debate centres on the finer points of
flower architecture, but raises a broader con- 
cern about using statistical models and large 
data sets to tackle biological questions, says 
Pamela Soltis, a plant biologist at the Univer- 
sity of Florida in Gainesville. “Things can be 
statistically possible without being biologically 
possible,” she says.

Flowering plants are a remarkable evol- 
utionary success. Although they appeared 
as recently as 140 million years ago, about
200 million years after the first seed plants,
they now make up about 90% of all living
land plants. But fossil flowers are scarce, and
botanists have long speculated about what the 
first blooms might have looked like. “The flower 
was responsible for this massive diversification,”
says Soltis. “We can’t understand how we got to 
where we are without understanding what the 
first one was like.” 

About eight years ago, the eFLOWER project 
enlisted a team of botanical experts to find out. 
The team catalogued more than 20 traits in 
nearly 800 species. They then matched these 
data with molecular studies of evolutionary 
relationships, and used statistical modelling to 
infer the features of the earliest flower. 


BUDDING DOUBTS 

The results painted a picture of a flower that 
was symmetric around a central axis and con- 
tained both male and female sex organs. The 
eFLOWER models also suggested that many 
organs in the first flower were whorled, mean- 
ing they were arranged regularly in concentric 
circles when viewed from above. But the authors 
also warned that statistical support for some of 
these findings was weak. 

Even so, the idea of a whorled ancestral 
flower shocked some people, says Hervé 
Sauquet, a lead author on the eFLOWER
paper and an evolutionary biologist now at 
the Royal Botanic Garden in Sydney, Australia. 
Many plant scientists expected that the bloom’s 


organs would have been staggered in a spiral 
coiled around a central axis. “It was a long-held
dogma that was never confirmed,” he says.

But what puzzled Sokoloff was that in Sau- 
quet’s analysis, the flower’s petals and male 
reproductive parts were arranged in whorls, 
yet the female reproductive organs, carpels, 
were arranged in a spiral. He had never seen 
this combination of whorled and spiral organs 
in a single flower. Moreover, he and his colleagues
suggest that it might not be developmentally
possible for plants to achieve two different
arrangements of organs in one flower.
That’s because the
organs emerge from the same region of the
plant, Sokoloff says. In some whorled flowers, 
the position of the carpels dictates the position 
of the male reproductive organs. Sokoloff’s team 
picked back through the eFLOWER database 
and found four examples in which whorled 
and spiral organs had been identified within 
the same flower. But after further analysis, they 
decided that each example contained only one 
type of reproductive organ. 

Sauquet says that his team has since revisited 
those data and agreed with some, although 
not all, of Sokoloff’s concerns. Repeating their 


analysis with an updated and expanded data 
set, they now find that all reproductive organs 
in the ancestral flower were probably whorled, 
he says. But some of the revised results had a 
relatively low degree of statistical support, 
just as the first analysis did. “It wasn't certain
before, and it remains uncertain,” Sauquet says.
“We don't know the final answer yet.”

Sokoloff says that a fundamental problem 
of eFLOWER’s approach was evaluating each 
trait of a flower independently before assem- 
bling those traits into a coherent bloom. “They 
analysed the evolution of each character sepa- 
rately,” he says. “But some combinations of 
characters are impossible.”

Even so, Sauquet argues that the absence of 
a particular form in modern flowers does not 
mean that it never existed. “There are a lot of 
weird things that existed before that we cannot 
see nowadays,” he says.

Settling the debate over the first flower will 
take a bigger database and more-sophisticated 
models, says Wenheng Zhang, who studies 
plant evolution at Virginia Commonwealth 
University in Richmond. But the eFLOWER
effort is an example of how modern techniques 
can be married to classical morphology to tackle 
fundamental questions about plant origins, she 
says. “This kind of study redirects botanists to 
look at the morphology,” Zhang says. “It just
comes back to the basics.”


SOUTH KOREA 


Child authors spark probe 


Researchers may have added relatives to papers to boost their chances at university. 


BY MARK ZASTROW 


The South Korean government is
expanding an investigation into
researchers who named their children
as co-authors on papers. In some cases, the
practice is thought to be intended to give the
children an edge when applying to university,
a highly competitive process in South
Korea. The education ministry announced
on 1 February that it would extend its original
investigation, which last month identified




82 academic papers on which authors had 
named their children or relatives — many 
of them in middle or high school — as 
co-authors. 

And on 4 February, the science ministry 
launched its own investigation into several 
of the country’s elite technical universities, 
which had not been included in the education 
ministry's initial probe. 

The 82 papers with child authors were 
uncovered in a month-long review of arti- 
cles written by more than 70,000 full-time 


university staff members across arts and 
sciences over 10 years. The review was 
prompted by a single case of child author- 
ship that came to light late last year, at Seoul 
National University. 

The investigation results, released on 
25 January, found examples from 29 South 
Korean universities. In 39 of the papers, the 
students seemed to have participated in the 
research as part of a programme related to 
their school curriculum; the other 43 appeared 
not to have, according to the investigation. 
The education ministry has not released
the names of the researchers involved in the 
cases, nor the journals in which they pub- 
lished. However, according to South Korean 
media reports, many of the papers appeared 
in journals included in the Science Citation 
Index (SCI). The ministry told Nature that 
the initial review relied on universities to 
self-report cases, and was not exhaustive 
because many staff members were on their 
winter holidays. 


INDEXED PAPERS 

In its continuing probe, the education 
ministry intends to examine papers by 
South Korean authors indexed in citation 
databases, including the SCI, Web of 
Science and Scopus, and to cross-check 
the names against the family relationships 
of 76,000 full-time faculty members. The 
investigation will run until 16 March. 

The ministry intends to refer each 
case to the corresponding university's 
research-ethics committee to confirm 
whether it constitutes misconduct or 
legitimate authorship. If the student 
co-authors did not participate in the 
research, academics will face possible 
disciplinary action, including dismissal, 
the ministry said. 

So far, the most-affected universities 
include some of Seoul’s elite tertiary
institutions: Sungkyunkwan University (eight
cases), Yonsei University (seven cases), 
Seoul National University and Kookmin 
University (six each). A Sungkyunkwan 
spokesperson confirmed that the university 
would be opening probes as per the min- 
istry’s request, including possible penalties 
of dismissal. 

Yonsei University declined to answer 
Nature’s questions about the investigation, 
pending further information from the 
government. A Seoul National University 
spokesperson emphasized that there has 
not yet been any finding suggesting that 
actual misconduct had occurred, and that 
its research-integrity committee would 
investigate all cases. 

A spokesperson for Kookmin 
University told Nature that an initial 
review of the institution’s cases indicated 
that the collaborations were legitimate. 
“We have some records and notes that 
their children participated in a lot of 
activities. So we think we don't have any 
problem,” he said.

The practice has sparked a national 
outcry. In an editorial, the Korea Herald 
called the acts “no less than fraud, which 
greatly threatens the integrity of universities 
and education as a whole in Korea”.

The education ministry said that any 
students listed as co-authors who did not 
participate in the research would have their 
university admission revoked.
Most container ships burn heavy diesel fuel that produces black carbon.


UN targets black 
carbon from ships 


Nations are advancing efforts to reduce sooty emissions. 


BY JEFF TOLLEFSON 


Governments are poised this week
to begin discussing rules to curb
black-carbon pollution from ships,
after nearly seven years of preparation. The 
sooty emissions, which are produced by 
diesel engines, warm the climate and harm 
human health. 

At a meeting in London, a panel of the 
United Nations International Maritime 
Organization (IMO) is expected to agree on 
measurement techniques to gather data that 
could support eventual regulations. That is 
the second step in a three-step process begun 
in 2011. Agreeing on a definition for black 
carbon took four years; the final step, writing 
rules, could take a few more. 

Reducing the amount of black carbon 
emitted by ships could have a significant 
impact on the climate. The pollutant, a 
melange of particles and oil droplets that 
come in many shapes and sizes, is the 
second-largest driver of global warm- 
ing — behind only carbon dioxide. Diesel 
engines, such as those in ships, account for 
around one-fifth of the world’s black-carbon 
emissions, according to a study published in 


2013 (T. C. Bond et al. J. Geophys. Res. Atmos. 
118, 5380-5552; 2013). 

The pollution is also dangerous when 
inhaled, in part because black-carbon 
particles collect other contaminants — 
such as sulfuric acid and heavy metals 
— as they travel through the atmosphere. 
Advocates are pushing the IMO to speed up 
its negotiations, which involve more than 
170 countries. “We really only have 90 minutes 
per year where we are actively discussing the 
topic, so it’s easy to delay and to stall,” says
Bryan Comer, a senior researcher at the Inter- 
national Council on Clean Transportation, a 
non-profit research group in Washington DC. 

Although global black-carbon emissions 
from diesel engines on land are roughly 20 
times higher than those from ships’ engines, 
the health and environmental impacts of 
shipping pollution hit many busy ports and
coastal areas disproportionately hard, says 
Daniel Lack, an independent consultant in 
Brisbane, Australia. “When you concentrate 
all of these ships into specific areas, all of a sud- 
den they become one of the most dominant 
sources of pollution.”

One area of special concern is the rapidly 
melting Arctic. The region’s shipping traffic
is projected to increase in the coming
decades, as sea ice recedes — a thaw that 
could be exacerbated by particles of black 
carbon, which hasten melting when they 
land on snow and ice. 

Measuring black-carbon emissions is not 
a trivial task, Lack says. The most accurate, 
and expensive, technology fires a laser pulse 
through exhaust samples in a tube. Black- 
carbon particles absorb and then release 
the energy from the pulse, creating a pres- 
sure wave whose strength is equivalent to 
the amount of light that was absorbed. The 
shipping industry is pushing for a cheaper, 
but less accurate, method that draws 
exhaust through a filter; measurements of 
the reflectivity of that filter before and after 
use are then used to determine how much 
pollution a ship emitted. 

Both approaches could serve a purpose 
as the IMO moves forward with regula- 
tions, Comer says. But he adds that many of 
the regulatory actions that the organization 
could pursue to reduce black-carbon emis- 
sions do not require regular measurements 
from ships. Shifting from ‘heavy’ fuel oil 
to cleaner types — similar to those used in 
trucks — would reduce ships’ black-carbon 
output by 35-80%, depending on the engine. 
And installing filters on the vessels’ exhaust 
systems would cut emissions by at least 85%. 

The shipping industry is under pressure 
to curb other types of pollution. The United 
States, Canada and the European Union 
already require ships to use lower-sulfur 
fuels in some coastal zones. And in 2016, 
the IMO agreed to reduce the sulfur content 
in all shipping fuels from 3.5% to 0.5% by 
2020. That is good news for public health, 
but it could inadvertently exacerbate global 
warming, says James Corbett, an engineer 
at the University of Delaware’s School of 
Marine Science and Policy in Newark. 

In a study this week (M. Sofiev et al. 
Nature Commun. 9, 406; 2018), Corbett 
and his colleagues found that the IMO sul- 
fur standard could reduce global cardiovas- 
cular and lung-cancer deaths attributable 
to fine particulate matter by 2.6%, and the 
incidence of childhood asthma by 3.6%. 
But the new standard could accelerate 
climate change by decreasing the num- 
ber of bright, sulfur-containing particles 
in the atmosphere that cool the planet 
by reflecting sunlight back into space. 
The researchers estimate that this effect 
would increase the human contribution to 
warming by around 3%. 

“We're talking some big numbers,” says
Corbett. 

For Comer, that is all the more reason 
to press forward with black-carbon 
regulations. “It’s frustrating,” he says. 
“We already know how to control [black- 
carbon] emissions, but we're stuck going 
through the three-step process.”
Cuts to Bulgaria’s science budget sparked strikes in November, following earlier protests by students.


Funding cuts hit 
Bulgarian science 


A push to attract investment in innovation has floundered. 


BY INGA VESPER 


European Union science ministers met in
February in their bloc’s poorest member

state — Bulgaria — to discuss future EU 
research policy. For the host nation, it was sup- 
posed to be a chance to showcase ambitious 
plans to boost economic growth by attracting 
international research institutes to the country. 

But the timing of the event was awkward, to 
say the least. In July 2017, Bulgaria had been 
due to receive €150 million (US$186 million) 
from the EU to build facilities for research and 
innovation, under a programme that aims 
to boost economic growth in poor regions. 
The programme, which was expected to give 
Bulgaria €700 million between 2014 and 2020, 
is designed to help with the costs of research 
infrastructure. 

However, EU authorities withheld the 
money after Bulgaria failed to identify enough 
sufficiently qualified scientists to evaluate 
the proposals. Then, in November 2017, the 
Bulgarian government cut its 2018 science 
and higher education budget by around 25%, 
a move it had planned in anticipation of the 
windfall. 

The decision has frustrated scientists in 
Bulgaria, because they had wanted to use 
the new infrastructure to forge links with 




researchers outside the country. “Now, we 
cannot prepare proposals because we are not 
going to have the infrastructure,” says Ana 
Proykova, a physicist at Sofia University and 
an adviser on European research infrastruc- 
ture to Bulgaria's government. She says that 
the government should reinstate the funds it 
cut from the 2018 budget. “We are still fighting 
very strongly for the funding procedure to be 
reopened, even if it is in the middle of this year.
Otherwise, our budget is going to be very tiny.” 

Bulgaria, which took over the six-month 
rotating presidency of the EU on 1 January, 
produces little science compared with the 
bloc’s other member states. The country’s out- 
put is low (see ‘Bulgaria's output lags behind’), 
and more than 30% of PhD-holding Bulgarians 
are at present pursuing careers abroad. But sci- 
entists in Bulgaria hope for improvements. The 
country intends to bid for a proposed Balkan 
synchrotron particle accelerator, a light source 
that many hope will promote international 
diplomacy in the region. 

Its universities still want to tap into EU 
infrastructure funds. During its presidency, 
Bulgaria is also in charge of negotiating 
Framework 9, the EU’s latest seven-year plan 
for science, which is due to be finalized in 
May. It sees the plan, in part, as an opportu- 
nity for Bulgarian companies to enter into 
lucrative contracts with international research
consortia. “Industry is very important for us,” 
says Karina Angelieva, adviser for education 
and research at Bulgaria's permanent represen- 
tation to the EU, in Brussels. 


RAISED SCRUTINY 

These plans are now at risk, unless Bulgaria can 
persuade the EU’s regional-policy directorate 
general to release the frozen funds. Mean- 
while, the 2018 science and higher education 
budget stands at 2013 levels: just 415 million 
leva (US$263 million), plus another 98 million 
leva for the Bulgarian Academy of Sciences. 

The financial difficulties also threaten 
Bulgaria’s national research-infrastructure 
road map, which was published in June 2017. 
Kostadin Kostadinov, an adviser to the coun- 
try’s science and education minister, Krasimir 
Valchev, says that the road map “will increase
research potential in Bulgaria according to the 
needs of local industry and regional develop- 
ment”, and that it is part of a plan ultimately to 
raise the country’s total science spending to 
1.5% of gross domestic product (GDP). That 
figure currently stands at 0.96% of GDP, which 
is less than half the EU average. 

Problems with science funding are
exacerbated by corruption, say several scientists.
Not only is Bulgaria the poorest country
in the EU, it is also the most corrupt, according
to Berlin-based lobby group Transparency
International. Proykova says that science is
rarely directly affected by monetary fraud, but
corruption makes itself felt in procurement.
“For example, things are never delivered to the
lab, even though the money has been transferred,”
she says. “Or, you get less good equipment
for the same money, because the company
takes some of the funds.”

Efforts to boost science in the European Union’s poorest country have been undermined by a lack of
funds. Bulgaria’s production of published research has risen at a slower rate than that of its neighbours.

Some scientists see Bulgaria’s turn in the 
EU presidency as a chance for change. Lidia 




Borrell-Damian, director for research and 
innovation at the European University Asso- 
ciation in Brussels, says that it provides an 
opportunity for Bulgaria’s universities to 
connect with others. Daniel Smilov, a politi- 
cal scientist at Sofia University, hopes that the 
presidency will put the country’s problems on 
the map, forcing change from outside that has 
been lacking from within. “It is an important 
moment,” he says, “because our visibility will
be great.”


SOURCE: WEB OF SCIENCE 


DNA SEQUENCING 


Super-invasive crayfish 
revealed to be a genetic hybrid 


Scientists examine DNA of a marbled crayfish that is spreading ferociously. 


BY EWEN CALLAWAY


Molecular biologists have sequenced
the genome of an invasive species

of crayfish that can reproduce 
without mating and is spreading rapidly 
across Madagascar. The marbled crayfish 
(Procambarus virginalis) was first spotted in 
aquariums in Germany in the 1990s. Now, 
DNA sequencing suggests that the species is 
probably the product of two distantly related 
members of a different crayfish species, the 
team reported on 5 February in Nature Ecology 
and Evolution.

The marbled crayfish has already been 
banned in the European Union and some 
parts of the United States because of the threat 
it poses to freshwater ecosystems. The species 
has now spread into the interior of Madagascar 
and risks crowding out seven native crayfish 


species. “This is a very aggressive population,” 
says Frank Lyko, a molecular biologist at the 
German Cancer Research Center in Heidelberg, 
who co-led the study. “If the marbled crayfish
continues to explode at its current pace, it will
probably outcompete endemic species.”

The marbled crayfish carries three copies of
each chromosome, instead of the usual two.
Lyko and his team sequenced the
genome of a single
individual from a laboratory strain known as 
Petshop. Its DNA revealed a surprise: it had 
two different genotypes at many places in its 
genome. The best explanation for this pattern, 
says Lyko, is that two of the chromosomes 




are nearly identical in sequence, but the third 
differs substantially. 

The two distinct genomes are closely 
related to those of another freshwater crayfish, 
Procambarus fallax, native to Florida and 
popular with aquarists. Lyko speculates that 
marbled crayfish emerged when the genome of 
a sperm or egg of one P. fallax individual became
duplicated, which can happen in response to 
sudden changes in temperature. If this cell was 
then fertilized by another individual living in 
the same aquarium, it would have resulted in 
an embryo with three copies of its genome, says
Lyko. This would represent a new species. Lyko 
says that the first marbled crayfish was probably 
born in an aquarium in either Germany or the 
United States, and its offspring widely shared 
between fish collectors. 

The first scientific description of the
marbled crayfish appeared in 2003, in a
Nature paper showing that all members
of the species they surveyed were female and
reproduced through parthenogenesis, a
process by which an unfertilized egg develops
into an adult with a genome identical to its
mother’s. How the first marbled crayfish
gained the ability to reproduce through
parthenogenesis is a mystery, says Lyko.

The marbled crayfish threatens to crowd out seven native species in Madagascar.

CORRECTION

The News story ‘Super-invasive crayfish
revealed to be a genetic hybrid’ (Nature
554, 157-158; 2018) incorrectly stated that
Julia Jones was the first to identify marbled
crayfish in Madagascar. In fact, another
team made the discovery; Jones and her
team were the first to survey its spread in
the nation.

To better understand the species’ spread, 


Lyko’s team did more-limited DNA sequencing 
of 49 individuals caught across Madagascar. 
These studies showed a stunning lack of genetic 
diversity, owing presumably to the species’ 
recent origin and ability to reproduce through 
parthenogenesis. 

Julia Jones, a conservation scientist at Bangor 
University, UK, who first identified* marbled 
crayfish in Madagascar in 2007, says that the 


species’ spread is due largely to their popularity 
as a food source. In 2009, she met a man on a
bus carrying a plastic bag full of them that he 
planned to dump into his rice fields in the hope 
of creating a sustainable stock, she says. 

Stopping their spread in Madagascar will be 
“almost impossible”, says Lyko. Collaborators
there have begun campaigns urging people 
not to transport the creatures or release them 
into rice fields. The message is a hard sell in 
a country where poverty levels are high and 
marbled crayfish are a cheap and popular source 
of protein. Lyko’s colleague brought a few dozen 
that she had caught to a family barbecue. “This 
went down quite well,” he says.
CORRECTION

The News Feature ‘The science that’s never 
been cited’ (Nature 552, 162-164; 2017) 
originally included a link to the data behind 
the charts. Nature has subsequently been 
told that the data are not available to make 
public, so the link has been removed online. 
SAMANTHA REINDERS FOR NATURE


BY LINDA 
NORDLING 


Wanga Zembe-Mkabile learnt a lot about herself from being
uncomfortable in other people’s kitchens. In 2009, the South 
African social-policy researcher was collecting data for her PhD 
on the outcomes of government child-support grants. The research called 
for ‘cupboard inventories’ — taking stock of the food in study participants’ 
kitchens. But seeing the embarrassment in home after home as people 
opened their often-empty pantries, Zembe-Mkabile felt something was 
amiss. “It just didn't feel right to look into people's cupboards,” she says.
At the time, she did not act on her unease. Only years later, as an estab- 
lished scientist, did Zembe-Mkabile begin to understand the complex- 
ity of her apprehension. Community-based research often puts young 
scientists in a position of power over research participants, a role that 
can be daunting and unfamiliar. But for Zembe-Mkabile, the feelings 
went deeper. She’d known apartheid, and how its architects had used
science to underpin their racist philosophies. The vestiges of that power 
imbalance were still there in her kitchen encounters. 

Wanga Zembe-Mkabile felt like both an insider and an outsider doing community research.

Amid a tumultuous political landscape, a generation of black researchers is gearing up to transform South African science.

Zembe-Mkabile grew up around poverty, but as a scientist trained at
the University of Oxford, UK, she was the product of a system shaped by
and for white Europeans. The tension between these two roles, the
insider and the outsider, is central to her identity as a researcher,
and has shaped her thinking about how research can, and should, be 
done. Now, working at the South African Medical Research Council in 
Cape Town, she directs studies on how social policy relates to poverty, 
inequality and health. She plans to involve communities at the design 
stage of her experiments and, eventually, to include them in the analy- 
sis as well. Already, if she feels a tool or question is not appropriate for 
its setting, she eliminates it from her research. “Some questions are not 
worth exploring if they are going to trample on people’s dignity,” she says.

Zembe-Mkabile thinks about her experiences a lot when she considers 
the mounting calls in her country to decolonize academia. Decolonization
is a movement to eliminate, or at least mitigate, the disproportionate
legacy of white European thought and culture in education. According to 
advocates, this is not just about increasing the number of black scientists, 
although such racial ‘transformation’ is an important part of the process.
It also means dismantling the hegemony of European values and making 
way for the local philosophy and traditions that colonists had cast aside. 
Substantial literature from around the world supports the need to change 
curricula, and some South African universities have begun to take action 
and establish review committees. But the push for change is sometimes 
tense. Student demonstrations have wrapped arguments about decolo- 
nization into protests over university fees, and have resulted in disrupted 
classes, fires and millions of dollars spent on security and repairs. 

Science departments have struggled to define what decoloniza- 
tion means for their curricula and for research. Most are ramping up 
efforts to overcome the glaring under-representation of black scientists, 
but what comes next is unclear. Zembe-Mkabile’s generation, which 
straddles pre- and post-apartheid South Africa, will soon be leading 
the country’s research institutions as they grapple with the challenge of 
reformulating science for the new South Africa. 


THE ROLE MODELS 

South Africa, like many nations, is currently dealing with high unemploy- 
ment rates and glaring inequality. These are cast into sharp focus by the 
legacy of apartheid rule. Although political power has been in the hands 
of the black majority since the dawn of South African democracy 24 years 
ago, economic power remains with white people: white households in 
2015 earned around 4.5 times as much as black households, and whites 
hold more than 60% of top management positions, despite accounting 
for only 10% of the working population. In universities, black people 
account for not quite 35% of academics, despite making up about 80% of 
the population. Students, meanwhile, face multiple barriers to achieve- 
ment, including an education system that has left many unprepared for 
university studies. A 2015 government report found that black South Africans
had the highest dropout rate in the country; 32% leave their studies in
their first year. As for curricula, African literature, philosophy, medicine 
and culture are often relegated to optional courses or skipped entirely. 

It is against this backdrop that researchers in Zembe-Mkabile’s 
generation forged their academic paths. Children during apartheid, 
they reached adulthood in the rainbow-coloured afterglow of Nelson 
Mandela’s release from prison in 1990. Some hail from communities that
are distrustful of science. In Xhosa, Zembe-Mkabile’s home language,
there isn’t even a word for research. The best approximation, she says, is
ukuphanda, which has negative connotations. “It means to search for a
bad thing, like a police investigation,” she says.

The scientists of Zembe-Mkabile’s generation are role models for 
the generation born after 1994, known locally as “born frees”. According
to a study published last year, this generation is predicted to boost
the country’s proportion of black researchers to more than 50% by 2025. 
That's a heavy burden for those like Zembe-Mkabile squeezed between 
the demands of the academic system that 
trained them and the expectations of a youth 
clamouring for radical change. Zembe- 
Mkabile says that protesting against inequal- 
ity in the universities was not on the table 
when she was a student. “You entered these 
spaces and you were so grateful to be there 
that you didn’t question anything. We were 
fast asleep. At least now, students are alert.” 


DECOLONIZING THE MIND 

Zembe-Mkabile’s experience is not unique. 
Amanda Hlengwa, an academic developer 
at Rhodes University in Grahamstown, has 
similar memories of her undergraduate 
degree in Durban in the late 1990s. “The goal 
was assimilation. That was the only way to 
survive.” This is changing, she says: univer- 
sities are beginning to recognize students’ 
diverse backgrounds, and the challenges that 
university culture presents. But strategies to 
address this gap have been slow to material- 
ize and are unevenly implemented. 

Thaddeus Metz, a philosopher at the 
University of Johannesburg, agrees. A white 
American who settled in South Africa in 
2004, he was the first to teach African 
philosophy at the nearby University of the 
Witwatersrand, the city’s most 
prestigious research univer- 
sity, where he worked before 
his current post. “There is this 
long-standing intellectual tra- 
dition that has been neglected 
at best, at worst denigrated,” he says. He 
adds that the majority of students, regard- 
less of their race, are curious about African 
knowledge traditions, but that there's a lack 
of institutional leadership. Many in the humanities and social sciences 
are angry because they feel isolated and powerless. 

In the natural sciences it gets more complicated, because the meaning 
of decolonization is not well defined and its relevance is contested. Does 
decolonizing science mean throwing out Isaac Newton, Charles Darwin 
and Gregor Mendel, and starting afresh with indigenous knowledge? 
Such demands have been made, most famously by a University of Cape 
Town student in an online video of a campus discussion titled ‘Science
must fall?’. Metz says he’s encountered the argument. “Some of my colleagues
think that if something hasn’t come from Africa, it’s somehow
disqualified.” 

But only a small minority of scientists hold such radical views. For 
most, decolonization of science calls for something more complex and 
subtle. “Decolonization is going to happen
in the mind,” says Siyanda Makaula,
a former cardiology lecturer who now 
works in university governance. Such 
shifts in thinking could mean, for exam- 
ple, that pharmacology students hear how 
drugs are being developed from plants 
their grandmothers used to treat stomach 
ache. This would show the relevance of traditional culture in modern 
science and anchor the curriculum in local experience. In other subjects, 
it could be about highlighting the contribution of non-Europeans, or 
facing the unsavoury history of a discipline: for example, exploring how
medical research had a role in fuelling racist ideas and how these were 
challenged and overturned. Across the board, it means ensuring that 
research addresses local problems and challenges. 


Nokwanda Makunga 
was warned off joining 
the faculty of a formerly 
all-white university. 

Makaula thinks that scientists often hide behind their disciplines’
putative universality — that a cell is a cell, whether it belongs to an 
African or a European, or that the laws of physics apply to all — to avoid 
the need to question the way they do things. “It’s an excuse they use,” 
he says. But the point of science, he adds, is to find solutions for real- 
world problems. And for that, context needs to be part of how science 
is taught, he says. “It’s about how you teach it, how you apply it, how 
you make it relevant, so the person can receive it and absorb it better.”

Such refocusing is taking much too long in South African universities, 
says Makaula. And that inertia is costing the country dearly in terms of 
black research talent. He sees himself as a prime example. A decade ago, 
Makaula earned a PhD in cardiology. But repeated brushes with racism 
and tokenism — being asked, along with other black students, to meet 
potential funders while his white colleagues could stay in the lab — frus- 
trated him to the point that he left academia. Today, he works for the 
Council on Higher Education based in Pretoria, a public-sector body 
that deals with quality control and regulatory compliance in universities. 

On the face of it, South African universities are working on decoloniz- 
ing their academic offerings. Most have created committees to review 
their curricula — although few have much to show for it. And all are under 
pressure from government and funding bodies to train and hire more 
black academics. Research funders are following suit. A few years ago, 
the Medical Research Council dedicated a significant portion of its largest
grant programme to early-career scientists, and added weighting for gen- 
der and race. The proportion of the grants going to white investigators 
has since shrunk, from 72% in 2012 to 37% in 2016. The council is also 
working on a position statement on decolonization to sharpen its efforts 
to recruit black scientists, says Glenda Gray, the council’s president (see
‘Three cultures’). It will look at how medical research can draw on social
science to become more sensitive to community needs. “You only get true 
well-being if you understand the context in which the biological happens.” 


‘NO PLACE FOR A BLACK WOMAN’ 
Some South Africans approach decolonization as a way to rediscover their 
heritage. Nokwanda Makunga, a biotechnologist at Stellenbosch Univer- 
sity near Cape Town, grew up in the intellectual circles that gave rise to 
anti-apartheid freedom fighters such as Steve Biko and Nelson Mandela. 
From a young age, Makunga knew exactly what a scientist did. One of 
her early memories is of helping her father — a botanist — count kernels 
of maize (corn) for an experiment. In the dying years of apartheid, she 
attended a private boarding school in Grahamstown, where racial ten- 
sions were muted. It therefore came as a shock when, in 1990, she arrived 
at university in Pietermaritzburg in the politically fractious province now 
known as KwaZulu-Natal. “I came from a bubble that was non-racial, 
non-political. Then I was launched into the true South Africa.” It was 
a struggle. Her superior education and clipped private-school vowels 
singled her out as “too white” to belong with the black students. But she 
was also too black for the white students. “I was getting it from both ends.” 
In 2004, after earning her doctorate, Makunga yearned to move to a 
quieter, more research-focused institution. She got an offer from Stellen- 
bosch University, a formerly all-white institution nestled in the pictur- 
esque Cape Winelands. It offered stability, and a platform for Makunga 
to build an international reputation. But it had also, historically, been a 
bastion of white supremacy, having produced infamous apartheid-era 
prime ministers such as Hendrik Verwoerd and D. F. Malan. Some of 
Makunga’s friends were horrified. “One told me that Stellenbosch is no 
place for a black woman. He didn’t say why, just that it was very
conservative.” Makunga took it as a challenge. “How will it ever be a
place for black women if no black women are willing to go there?”

THREE CULTURES

FOREIGN BLACK RESEARCHERS FACE ADDITIONAL CHALLENGES IN SOUTH AFRICAN ACADEMIA

Black researchers are rapidly moving into South Africa’s academic spaces.
But not all of them are considered ‘black’ by the country’s Department of
Higher Education and Training. Researchers from other parts of the world
are instead classified as ‘foreign’.

It’s a large and fast-growing segment. One report found that although
black PhD graduates outnumbered whites for the first time in South
Africa’s history in 2012, more than half of them hailed from countries
such as Nigeria, Zimbabwe, Uganda and Kenya.

There are a few reasons for this. It is cheaper to study in South Africa
than in Europe or the United States, and the country offers better
research facilities than elsewhere in Africa. But for some locals, the
growing presence of foreign black researchers is a problem. Jobs are
scarce, and some believe that universities are more willing to hire
non-South African black people than locals.

So foreign black scientists, such as Thumbi Ndung’u, a Kenyan virologist
based at the University of KwaZulu-Natal in Durban, experience a special
kind of alienation. “You can’t completely identify with the local black
population. They see you as an outsider. On the other hand, you are not
in the white old-boys’ club,” he says.

Ndung’u had anticipated some friction when he moved to Durban in 2005 to
study HIV. But it wasn’t until he lived there that he began to understand
the frustrations of local black academics. The system is blind to its own
biases, he says.

Most of Ndung’u’s own graduate students are black South Africans. They
face many challenges, he says, but, given the right support, they
blossom. “There needs to be express effort to get them into the system.
So South African universities don’t continue to have this problem in the
future.” L.N.

Makunga’s research has brought her closer to her roots. When she was
growing up, her family did not use traditional medicine, she says. But
now she studies South African medicinal plants, using modern
biotechnology to explore their pharmacological properties, and she
reckons the work is “pretty decolonial”. Having studied a variety of
plants, in 2016 she returned to the Eastern Cape, where she grew up, to
learn about the traditional medicine practised by her ancestors. She
takes her responsibility as custodian of these practices seriously. “I’m
holding somebody else’s knowledge. I need to treat it with respect,” she
says.

Stellenbosch has changed a lot since she was warned off, says Makunga,
who is currently at the University of Minnesota in Minneapolis for a
nine-month Fulbright scholarship. She feels welcome at Stellenbosch and
valued as a black woman, still a rare occurrence at faculty level. Still,
she longs for the day when that is not the headline issue; when she can
be a scientist first and a black woman second. “I would like us to move
beyond our apartheid race hangover,” she says wistfully.

Black women are among the most under-represented groups in South
Africa’s academic melting pot. They make up 14% of the country’s
researchers, compared with black men’s 18%. And they face tough odds.
In her 2015 article ‘Leadership: The invisibility of African women and the
masculinity of power’, Mamokgethi Phakeng writes that black women,
as well as being marginalized for their gender and race by white society,
face opposition from patriarchal African cultures. This “masculinity of
power”, she writes, needs to be challenged alongside colonialism and
sexism.

Phakeng is a scholar of mathematics education and deputy vice-chancellor
of research at the University of Cape Town. Her outspokenness about the
experience of black researchers has charmed an army of Instagram and
Twitter followers. But at times the veneration is misguided, she feels.
It’s great to inspire young people to speak up and be themselves, “but I
don’t want that to be my most powerful role”, she says. To her, the most
important thing she has done is excel as a researcher. Phakeng’s work
centres on mathematics and language. She showed, for example, that
code-switching (alternating between languages) helps multilingual people
to understand mathematical concepts. This is significant in South Africa,
where students have been scolded for using their home language. Today,
code-switching is encouraged in many classrooms.

Yet, critics have argued that her field is not a suitable background for
the head of research at one of Africa’s strongest science institutions.
Last October, an e-mail started circulating, questioning her
qualifications. She took on her attackers, and vice-chancellor Max Price
denounced the e-mail and its contents.

Mamokgethi Phakeng is ambivalent about being a role model for young black
academics.

The future of South Africa’s university sector is uncertain. Hlengwa
worries that the momentum created by student protests might fizzle out
without sustained change taking root. “While the
heat was on, you had opportunities to work 
on transforming curricula,” she says. But as
universities learn to work with unrest, they 
snap back to old ways, she says. She also wor- 
ries that black academics are being run into 
the ground by the demands placed on them 
— from being called on to sit on diversity 
committees to giving advice on the complex 
challenges facing black students and staff. 
“Where’s the space for me to do some deep
thinking about my research?” Hlengwa asks. 
It is a burden and a challenge. And 
Phakeng argues that it can be helped only 
by discourse. One of the things she has done 
since joining the University of Cape Town 
in mid-2016 is to speak to its black South 
African academics. For some, she says, 
it’s the first time they've been called on by 
management to share their experiences. “I ask people, what stories do you
tell yourself? Those stories shape the possibilities of what we can do.”


Linda Nordling is a freelance journalist in Cape Town, South Africa. 

Protect the neglected half
of our blue planet 


Maintaining momentum is crucial as nations build a treaty to safeguard the high 
seas, argue Glen Wright, Julien Rochette, Kristina M. Gjerde and Lisa A. Levin. 


At the close of 2017, 14 million UK
viewers tuned into the acclaimed second
series of David Attenborough’s
Blue Planet, making it the year’s most-watched
television show. It brought the wonders
of the ocean into people’s living rooms
and captured the public imagination as never 
before. Now is the time to capitalize on this 
enthusiasm, and to advocate for strong, 
legally binding protections for the high seas 
— the almost two-thirds of our planet’s ocean 
that are beyond the control of any one state 
(see ‘Neglected waters’). 


A start has been made. A landmark 
resolution was adopted at the United Nations 
General Assembly on 24 December 2017, 
marking the beginning of formal diplomatic 
negotiations for an international treaty to 
conserve and sustainably use the high seas. 
Co-sponsored by more than 130 nations, 
Resolution 72/249 is the result of more than 
a decade of scientific debates, legal controversies
and political wrangling. The decision
paves the way for a range of measures,
including a much-needed system of global 
marine protected areas (MPAs) to sustain
aquatic life in a rapidly changing ocean.

Now we must ensure that real progress 
is made over the next few years. The treaty, 
expected some time after 2020, will need to 
include provisions for firm international 
oversight and direction if it is to have any 
chance of overcoming problems with the 
existing regulatory framework. As with any 
such negotiations, there is a risk that they 
will result in a toothless call for ‘urgent’ 
action and increased cooperation. 
Non-governmental organizations and 
environmental groups will continue to
push for strong conservation measures,
including strictly protected reserves. The 
research community can contribute by 
speeding up the collection of baseline data 
to characterize existing environments and 
biodiversity; coordinating observation 
efforts across disciplines; monitoring and 
assessing ocean health; and further study- 
ing how MPAs and other conservation tools 
work on the high seas. Indeed, ocean science
could be a unifying focus for this new
agreement. Social scientists, legal scholars
and other experts can feed the negotiations 
with pragmatic options. 

Together, we must advocate for a strong 
international treaty if crucial high-seas 
ecosystems are to survive and thrive. 


PATCHY PROTECTION 

Governments have repeatedly made 
high-level political commitments to con- 
serve marine biodiversity. The Aichi 
Biodiversity Targets and the UN Sustainable 
Development Goals, for example, demand 
protection of 10% of the world’s ocean 
(although some scientists argue that at least
30% is necessary). This is to preserve wild
spaces, sustain fisheries, protect the ecosystems
that regulate the climate and preserve
a wealth of biodiversity. Governments have
nonetheless been slow to act. Just 4% of the
ocean is currently protected, and hardly any
MPAs cover the high seas.

Marine areas beyond national jurisdiction 
are regulated by a patchwork of different 
agreements and institutions, each with their 
own peculiarities and pitfalls. Most of these 
organizations focus on the management of a 
particular resource or activity. The International
Seabed Authority (ISA) oversees seabed
mining. Regional fisheries-management
organizations regulate high-seas fisheries.
And the International Maritime Organiza- 
tion sets out shipping rules. 

There are few channels of communication 
between these agencies, much less formal 
cooperation or coherence between their 
management measures. Their decisions are 
highly politicized, and the need to reach 
consensus among member countries can 
trump scientific evidence. 

Some regional initiatives have made 
limited progress. The OSPAR Commis- 
sion, named after its original conventions in 
Oslo and Paris, is composed of 15 countries 
and the European Union. It has designated 
ten MPAs in the high seas of the northeast 
Atlantic. However, these apply only to its 
member countries, and OSPAR does not 
have the authority to regulate many activi- 
ties or to ensure that conservation is part of 
fisheries decisions. 

In 2017, the ISA approved a 15-year 
exploration contract with Poland, covering 
part of the Mid-Atlantic Ridge. Within this 
OSPAR area sits the Lost City hydrothermal
field, a unique range of 60-metre-tall calcium
carbonate chimneys. The UN Educational, 
Scientific and Cultural Organization 
(UNESCO) and the International Union for 
Conservation of Nature (IUCN) have highlighted
that the site might meet the criteria
for World Heritage status. The ISA did not
consult UNESCO, the IUCN or OSPAR, 
which left scientists who study this feature 
with no avenue for input other than writ- 
ing a letter of concern after the contract was 
approved. 

On the other side of the Atlantic, the 
Sargasso Sea Commission is attempting to 
protect a unique floating forest of Sargassum 
seaweed — recognized as an Ecologically or 
Biologically Significant Marine Area (EBSA) 
under the Convention on Biological Diversity. 
This recognition does not entail any man- 
agement measures, and only one of the few 
regulatory bodies in this high-seas region has 
shown an interest in implementing any. 

In the Southern Ocean, countries have 
worked together within the dedicated 
Antarctic Treaty System to designate part 
of the Ross Sea as the world’s largest MPA 
(1,549,000 square kilometres). This required
intense diplomatic efforts. Nonetheless, the
resulting protections are limited, and the
process to establish more MPAs recently
stalled.

NEGLECTED WATERS

Almost two-thirds of the planet’s ocean are classed
as international waters or high seas, meaning that
they are beyond the control of any one state.


CONSERVE AND CONNECT 

Research has shown that MPAs are effective 
if they are done right. Large, long-term, ‘no- 
take’ reserves that are isolated by deep water 
or sand and backed up with strong enforcement
have five times more large-fish biomass
than unprotected areas.

Recent advances have greatly improved
the evidence base for MPAs on the high 
seas, dispelling many common assumptions
about their feasibility and efficacy.
For example, scientists previously thought 
that species ranges were too big to designate
meaningful MPAs, but we now know
that even wide-ranging deep-water species
assemble to feed and spawn, and use particular
habitats for nurseries. So strategically
protecting just part of a species’ range could
help to sustain populations.

It is easy to find candidates for an initial 
suite of MPAs. UNESCO has identified 
5 possible high-seas World Heritage Sites; 
nearly 50 EBSAs cover portions of the high 
seas; fisheries bodies, following require- 
ments in UN resolutions, have identified 
‘vulnerable marine ecosystems’ susceptible 
to impacts from bottom trawling; and the 
ISA is identifying ‘areas of particular environmental
interest’. These designations cover
a broad range of habitats, from deep-water 
coral grounds to abyssal plains, and are 
grounded in scientific criteria — including 
a site’s uniqueness, productivity, complexity 
and fragility. 

Protecting such sites is a start, but will not 
insure the ocean against the many threats it 
faces. A wider network of representative and 
connected MPAs will be needed to provide 
resilience to climate change and to maintain
biodiversity by ensuring links between
migration routes and spawning grounds.
No one has worked out where, how large or 
how deep these areas should be. Comple- 
mentary measures and better management 
will also be needed for the ecosystems and 
activities that fall outside MPAs. 

Calling for protection of a swathe of the 
high seas might seem starry-eyed. But some 
have made the case for entirely closing the 
high seas to fishing, arguing that this would 
lead to greatly increased fisheries yields and 
profits overall.

More research will be required if we are 
to protect these deep and distant seas effec- 
tively. Ramping up basic research efforts 
to improve baseline data is crucial, as is 
improving our understanding of how climate 
change and other stressors affect invaluable 
ecosystem services. We will need to better 
coordinate and expand existing observa- 
tion programmes, improve data access and 
promote training for young scientists. Next- 
generation molecular, computing, telemetry 
and observing technologies must also be 
developed and applied. 

Some progress is being made here: research- 
ers are developing techniques for growing 
deep-sea organisms in the laboratory, shed- 
ding light on their reproductive traits and 
biology. The UN has declared 2021-30 as 
the Decade of Ocean Science for Sustainable 
Development. This should help to mobilize 
the scientific community around these issues. 


SOURCE: NATIONAL GEOGRAPHIC 


DAVID FLEETHAM/BARCROFT/GETTY 


Great white sharks (Carcharodon carcharias) often visit an area of the Pacific Ocean dubbed the White 
Shark Café. It is one of five areas proposed as a high-seas World Heritage Site. 


The questions of who will designate 
MPAs and how management measures will 
be implemented are politically charged. The 
strongest possible outcome of the upcoming 
treaty negotiations, from a legal perspective, 
would be a new UN body with wide powers 
to make binding, top-down decisions, work- 
ing in concert with existing organizations. At 
the other end of the spectrum, states could 
be left to cobble together MPAs within the 
existing system, with the new agreement 
providing some form of obligation and over- 
sight. The former would provide a powerful 
means of protecting this important global 
commons; the latter might leave conserva- 
tion beholden to the failings of the current 
framework, with states likely to continue 
dragging their feet. 

A balance will need to be found. To be 
effective, the new instrument must provide 
sufficient international oversight, while 
respecting the mandates of existing organi- 
zations and ensuring that a majority of states 
is prepared to sign up. 

What is certain is that individual states 
will remain responsible for controlling 
ships flying their flag. Proactive states could 
therefore agree to work collectively through 
the new treaty to protect priority places by 
controlling the biodiversity impacts of their 
vessels, while encouraging non-parties to do 
the same. However, the negotiations are not 
intended to address ‘flags of convenience’,
whereby a country registers vessels on a ‘no 
questions asked’ basis, generally in exchange 
for a fee. 


GENETIC GOLDMINE 

Marine protection is only one part of the 
treaty discussions. The question of how to 
regulate the exploitation of marine genetic 
resources also promises to be both techni- 
cally and politically challenging. 

Genes extracted from marine creatures in 
the high seas are being used to develop new 
pharmaceuticals and cosmetics. There is 
currently no requirement to share the profits 
that arise from the exploitation of this com- 
mon resource. 

The few states that have the capacity to 
conduct bioprospecting are keen to maintain 
the status quo, which is essentially first come, 
first served. Others want to create a formal
mechanism for sharing
the profits, similar
to a system already
being put in place for 
seabed mining. The 
long and complicated 
chain of discovery 
makes it difficult to 
capture any monetary 
benefits. And the dis- 
tinction between bioprospecting and ‘pure’ 
scientific research, which is permitted by 
the UN Convention on the Law of the Sea, is 
often far from clear. 

Researchers can help here, too, such as by 
improving open-access protocols for data 
and samples. Even in the absence of a com- 
prehensive regulatory framework or an obli- 
gation to share profits, a new treaty could 
still include helpful provisions that would
promote international science cooperation,
capacity building and the development and
transfer of marine technology.

Turning good intentions into an effective 
treaty with meaningful MPAs will take time, 
money and scientific input. There is uncer- 
tainty regarding the position and role of the 
United States, which has not ratified the UN 
Convention on the Law of the Sea. Devel- 
oping countries are calling for greater assis- 
tance, which will require developed nations 
to commit considerable resources. But with 
the majority of countries now in favour of 
a new agreement, momentum is building.

Tough diplomatic negotiations might 
nonetheless be necessary to reach consen- 
sus on the finer details of the new treaty. The 
beauty and value of our ocean could be lost 
all too easily in the windowless halls of the 
UN’s New York headquarters, obfuscated 
by realpolitik and the arcane details of inter- 
national law. Political leaders will need to see 
strong science and public support if they are 
to develop an ambitious agreement to finally 
protect the neglected half of our blue planet.

France. Julien Rochette is director of the
oceans programme at IDDRI. Kristina
M. Gjerde is senior high-seas adviser to the
International Union for Conservation of 
Nature in Gland, Switzerland. Lisa A. Levin 
is distinguished professor of biological 
oceanography at the Scripps Institution of 
Oceanography, University of California, San 
Diego, La Jolla, California, USA. 

e-mail: glen.wright@iddri.org 

BOOKS & ARTS


The west coast of Greenland is fertile ground for geologists. 


Epiphanies of the edgelands 


Ted Nield admires a geologist’s layered exploration of Greenland’s remote fringe. 


Geologist and former surf dude
William Glassley has spent six field
seasons studying the ancient rock of
coastal Greenland. As he probes our planet’s
youth, three billion years ago, many will
envy him. His brief but ambitious A Wilder 
Time demonstrates that there’s nothing like 
geology for acquainting you with the joys of 
remote isolation at other people's expense. 
The area he explores, with Danish col- 
leagues Kai Sorensen and John Korstgard, is 
vast: part of the coastal fringe of ice-smoothed 
rock and periglacial tundra that extends like a 
valance around Greenland’s enormous central 
ice cap. There is sea to the west, crumbling ice 
cliffs 150 kilometres and more to the east. A 
Wilder Time sees our heroes marooned in this 
wilderness, alone in the short summer’s per- 
petual day. Glassley eloquently evokes a place 
where land feathers into Arctic sea, ice floes 
glide by on mirror-smooth tongues of clear, 
frigid water and silence reigns. 

A Wilder Time: Notes from a Geologist at the Edge of the Greenland Ice
WILLIAM E. GLASSLEY
Bellevue Literary Press

What drew the companions there
might sound, by contrast, like a storm in
an academic teacup. Someone (tactfully
left unnamed) had published a paper
attacking the established geological view
that the study area, between Nordre
Isortoq in the south and Disko Island to
the north, is part of the roots of an ancient
mountain range, the Nagssugtoqidian
mobile belt. Geologists are familiar with these
Inuit place-names, many ending in ‘oq’.
Pronunciation should sound, the late Stephen
Moorbath (an isotope geochemist and geochronologist)
once told me, “like a piano
string being cut at the bottom of the ocean”.
Moorbath helped to make the area famous 
by finding what are still among the oldest 
known rocks on Earth, almost 3.8 billion 
years old. In the 1960s and 1970s, geologists 
Arthur Escher and Juan Watterson mapped
these high-grade metamorphic melanges of 
altered sediments, mantle rocks and ocean- 
floor basalts. In the 1980s, Feiko Kalsbeek, 
Bob Pidgeon and Paul Taylor interpreted it 
all in the light of plate tectonics. The distinc- 
tive east-west shear zones that transected 
the region were, they said, sutures left by 
the most ancient plate-tectonic collisions on 
our planet, during the early Proterozoic eon, 
which began around 2.5 billion years ago. But 
seeds of doubt were cast. The new paper made 
fundamental challenges to earlier interpreta- 
tions that seemed themselves so misguided 
and riddled with errors and misconceptions 
that they could not go unanswered. 

Such is the scientific narrative under- 
pinning A Wilder Time, whose rather over- 
complicated structure arrives at a satisfying 
conclusion. The epilogue demonstrates how 
Glassley’s team confirmed and even refined 
the original interpretation of the mobile belts, 
putting its assailants to flight. 

This story offers perspectives on deep time 
to boggle minds, from the immense ages of
the rocks and events concerned. Metamorphic
petrology is no easy material for popular
science. By the time you explain how the
phyllites, schists and gneisses started life, why 
they were taken to the depths of Earth’s crust 
and how their minerals were changed under 
combinations of heat and pressure (each pro- 
ducing distinctive suites of new minerals), 
many readers will have lost interest. Wisely, 
Glassley doesn't try too hard — which is fine, 
because the science is almost a narrative ploy. 

Natural scientists may be the only intel- 
lectuals these days who find themselves 
routinely exposed to the transformative 
experience of wilderness. Yet (as I have seen 
during desert fieldwork in the Middle East) 
on many of them it seems wasted. This may 
not be their fault. Expeditions, such as Glass- 
ley’s, are a lesson in how travel can narrow the 
mind. It is hard enough to keep focused on 
the work when trying to cope with midges, 
heat, cold, disorientation, altitude and home- 
sickness, never mind dehydration, disgusting 
camp food and the physical consequences. It 
takes a deep attunement to the wild’s allure to 
keep appreciating the surroundings. 

Glassley’s vivid impressions of West Greenland
attempt what few scientist-writers try: to
explore beyond the comfort zone of his field. 
Followers of the medieval philosopher Duns 
Scotus coined the terms haecceity (‘thisness’,
of specific things) and quiddity (‘whatness’,
of the classifier). These are ‘science’. Glassley
tries also to grasp something beyond: the
noumenon, an ineffable inner reality in things 
that cannot be discerned by the senses. 

Not everyone experiences this psychologi- 
cal epiphany. Scientists sometimes have it 
trained out of them by the relentless insistence 
on cold measurement. Glassley, by contrast, 
seems obsessed with our limitations when it 
comes to grasping the wholeness of the world. 
He questions, for instance, how our ‘reality’ 
contrasts with, say, a seal’s, or a fish's. Absent- 
ing himself from camp, he wanders alone with 
his reflections, and attempts closer commun- 
ion with the hidden genius of place. 

Although he repeatedly explains what he’s 
attempting (a scientist’s tendency to write 
abstracts for everything?), he is not always 
successful; yet I enjoyed and admired the 
attempt. What he gropes for requires art, not 
analysis. Perhaps that was why I kept returning
to Hugh MacDiarmid’s great 1934 poem,
‘On a Raised Beach’, which explores the limitations
of science in expressing the wholeness
of nature. After an opening parody of
scientific language, the poet observes: “Deep 
conviction or preference can seldom/Find 
direct terms in which to express itself.” ■


Ted Nield is editor of Geoscientist and 
author of Supercontinent. In an earlier life, 
he, too, sensed the noumenon in remote 
places at other people’s expense.

e-mail: ted.nield@geolsoc.org.uk

Books in brief


The Tyranny of Metrics 

Jerry Z. Muller PRINCETON UNIVERSITY PRESS (2018) 

Economic historian Jerry Muller delivers a riposte to bean counters 
everywhere with this trenchant study of our fixation with performance 
metrics — a cultural ubiquity saturating education, medicine, finance 
and governance. As he argues, this reductive approach to monitoring 
efficiency almost inevitably backfires. It can lower morale by riding 
roughshod over professionals’ experience; invite manipulation, from 
“gaming the stats” to “teaching to the test”; discourage innovation, 
promote short-termism; and reward dumb luck. Metrics, he asserts, 
can usefully bolster judgement, but not supplant it. 


The Food Explorer 

Daniel Stone DUTTON (2018) 

In the annals of intrepid botanists combing the globe for novel
species, David Fairchild is a name to conjure with. At the turn of the
twentieth century, the plant scientist introduced 200,000 ‘exotic’
species to the United States, then something of a culinary blank slate.
Kale, avocados, mangoes, pomegranates and even quinoa began
to work their way into US consciousness and, eventually, markets.
Daniel Stone’s rip-roaring tale takes us from Fairchild’s youthful
meeting with naturalist Alfred Russel Wallace in Kansas to collecting
trips across more than 50 countries, from Trinidad to China.


The Future of Humanity 

Michio Kaku DOUBLEDAY (2018) 

This latest foray into futurism by Michio Kaku sees the physicist 
unbowed by woes political and planetary. As a master of the long 
view, Kaku plots humanity’s path to becoming a “multiplanet 
species”. He marshals fresh advances in artificial intelligence, 
nanotechnology and bioengineering for his vision, segueing from 
lunar stations and Martian colonies to interstellar travel and human 
genetic engineering. There’s plenty of hypothetical innovation, too: 
ramjet fusion machines, antimatter engines and “laser porting” of 
human connectomes to enable bodiless exploration of the cosmos. 


A Shadow Above: The Fall and Rise of the Raven 

Joe Shute BLOOMSBURY (2018) 

Size-wise, the king of corvids is the raven. But for journalist
Joe Shute, the bird is also an emblem of our age, caught between
the ebb of wilderness and the hope of regeneration. In Britain, after 
a long, catastrophic decline, numbers have bounced back by 45% 
over the past two decades. Celebrating that fact, Shute gets inside 
the skin of the ‘feathered ape’ with the “rhino-horn beak” and 
aerial virtuosity. That journey becomes a rich and beguiling tangle 
of cultural and natural history, birding diary and account of corvid 
fandom — Charles Dickens being one notable devotee. 


Making the Monster 

Kathryn Harkup BLOOMSBURY SIGMA (2018) 

Chemist Kathryn Harkup’s scientific contextualization of
Mary Shelley’s Frankenstein at 200 is a worthy addition to a crowded
shelf. She explicates how trailblazing discoveries in galvanism, 
chemistry and anatomy helped to form the bones of the book, while 
its heart beat to the rhythm of Shelley’s radical intellectual lineage 
and milieu. Harkup’s handling of Shelley’s own story and the literary 
alchemy wrought by this brilliant teenager compels, not least on 
how the science fiction has seeped into science fact. Barbara Kiser

Into the wilds of brain boosting


Trevor Robbins lauds a personal journey aimed at cracking cognitive enhancement. 


Is there a common element that binds
diverse mental abilities, from language
to mental arithmetic? Or do these skills
compete for our brains’ limited resources? 
In The Genius Within, David Adam tackles 
the controversial topic of intelligence, in a 
follow-up to his exceptional book on obses- 
sive-compulsive disorder (OCD), The Man 
Who Couldn't Stop (Picador, 2014). 

Adam (Nature’s Editorials editor, with 
whom I worked on a film about the use of 
animals in OCD research) deftly surveys 
attempts to test intelligence starting more 
than a century ago, by statistician and eugeni- 
cist Francis Galton and others. Psychologist 
Charles Spearman introduced g as a measure
of overall performance across, for instance,
verbal and spatial capacities. Alfred Binet
introduced the first practical test for intelli-
gence quotient (IQ), purporting to determine
a child’s ‘mental age’. In the ensuing decades,
intelligence testing has come under fire
for cultural bias, as well as limited scope; it
excludes ‘emotional intelligence’, for example.
Adam explores the dissent, from calls for rec- 
ognition of neurodiversity to the furore over 
Richard Herrnstein and Charles Murray’s 
1994 book The Bell Curve (Free Press). 

Adam's book hinges on one question: can 
we beef up our intelligence? Ever perspi- 
cacious and intrepid, Adam turns guinea 
pig, exploring means from ‘brain train- 
ing’ to cognitive-enhancer drugs. He takes 
modafinil (licensed for treating the sleep 
disorder narcolepsy) and finds, consistent 
with lab studies, enhanced mental focus as 
he works: “I did feel different ... like I was
concentrating on the words I wrote in a
more deliberate way.” He also endures bouts
of crude electrical brain stimulation that 
simulates the effects of transcranial direct 
current stimulation (tDCS), administered by 
his spouse. In the lab, tDCS has been claimed 
to promote learning by exciting or inhibiting 
neural activity across specific regions of the 
cortex. 

Adam concludes there is evidence that 
intelligence can be boosted. Improving 
cognitive performance is important in 
people affected by Alzheimer’s disease, for 
instance, or conditions such as attention defi- 
cit hyperactivity disorder. But he argues that 
enhancing ‘normal’ performance could also 
be a viable goal. The claim that most people
perform optimally is palpably false: think of 
the effects of fatigue on exam performance. 
Moreover, surveys reveal that students and 
academics use cognitive-enhancer drugs. 




Robert Klark Graham ran a ‘genius sperm bank’. 


Yet successful cognitive enhancement carries 
myriad ethical implications. Is it fair to take 
modafinil while studying for exams? Could 
an employer insist on treatment for employ- 
ees? How would all this be regulated? 
Importantly, does cognitive improvement 
carry a neurological price? Is it possible to 
enhance g, or does boosting one aspect of 
intelligence, such as working memory, 
degrade another? A study last year found 
that modafinil can enhance the performance 
of people playing chess against computers, 
but they lose more games through time 
defaults (A. G. Franke et al. Eur. Neuro-
psychopharmacol. 27, 248–260; 2017).
Adam makes several fascinating digres- 
sions into the grisly underbelly of cognitive 
neuroscience. The Repository for Germinal 
Choice, for instance, was eugenicist Robert 
Klark Graham’s scheme for selling sperm from Nobel
prizewinners. The nineteenth-century Society of Mutual
Autopsy sought to advance neuroscience by dissecting
dead members’ brains. Adam reveals how assessment of
death-row prisoners’ intellect can halt, or implement,
execution.

The Genius Within: Unlocking Your Brain’s Potential
DAVID ADAM
Pegasus: 2018.

Most remarkable are his discussions of extraordinary
mathematical or musical abilities in people 
with, for instance, autism or brain damage. 
This sort of neurodiversity highlights a need 
for clearer understanding of whether specific 
brain regions actively compensate for loss 
of function due to damage, or whether the 
capacity of one brain region is unmasked 
by a difference in another. High-resolution 
neuroimaging methods may begin to answer 
these questions. Functional imaging studies 
show that the brain’s grey and white matter 
are both capable of considerable plasticity. 
Description of underlying brain mecha- 
nisms is not the province of The Genius 
Within; these will eventually deserve an 
authoritative account. 

Adam also ventures into commercial brain 
games and apps, which challenge attention 
and memory. Computerized training of 
working memory (through, for instance, 
remembering the locations of objects in 
grids) has been claimed to improve cogni- 
tion in both healthy people and those with 
mental illnesses. Given that working mem- 
ory is closely related to IQ, such training 
might be expected to produce more-general 
improvements in intelligence. However, con- 
troversial attempts at controlled trials have 
sometimes cast doubt on this. 

I enjoyed The Genius Within enormously. 
Eminently readable, it takes a relatively 
holistic ‘mind-body’ view of our abilities, 
beyond our capacity to decipher rotating 
polygons or shuffling anagrams. Dancers 
and sportspeople have obvious ‘motor intel- 
ligence’: executing complex choreography or 
turning a football game on a visionary pass 
demand visual and spatial acuity, exqui- 
site coordination and seamless interaction 
with others. So Adam’s discussion of how 
electrical stimulation of the motor cortex 
improves the performance of road cyclists is 
fascinating (even as it raises questions about 
detection). 

IQ testing will continue to be criticized.
We would do better to take into account a
broader range of cognitive capacities, with
more-sophisticated measures of behaviour
in a range of situations. And we must work
harder to obtain an optimal balance of these
capacities by fine-tuning our brains through
ever-evolving methods. ■


Trevor Robbins is professor of cognitive 
neuroscience at the University of 
Cambridge, UK. 

e-mail: twr2@cam.ac.uk

Existing rules cover
gene-drive usage 


Gene-drive technology is not 
unregulated, as you imply 
(Nature 552, 6; 2017). Because 
it involves genetically modified 
(GM) organisms, it is covered in 
countries that have regulations 
on gene modification and 
internationally by the Cartagena 
Protocol on Biosafety. 

It could be argued that the risks 
are not comparable for contained 
laboratory use versus deliberate 
release of GM organisms into the 
wild. This assumes that lab safety 
standards based on pathogenicity 
would be inadequate for non- 
pathogenic gene-drive organisms. 
However, European regulations, 
as well as, for example, German 
law, put protection of the 
environment on a par with 
protecting human health, even for 
contained usage. The potential of 
GM organisms to persist in the 
environment and spread into wild 
populations has always been a 
crucial part of risk assessment for 
transgenic organisms. 

Existing regulations therefore 
cover environmental risks 
arising from contained handling 
of gene-drive organisms, as 
confirmed by the German 
Central Commission for 
Biological Safety (see
go.nature.com/2enrjy4). Researchers in
Germany and the Netherlands 
need permission for gene-drive 
experiments. Risk assessment 
is then made on a case-by-case
basis. 

Swantje Strassheim, Werner 
Schenkel Federal Office of 
Consumer Protection and Food 
Safety, Berlin, Germany. 
swantje.strassheim@bvl.bund.de 


Bitcoin’s alarming 
carbon footprint 


The ‘mining’ process for the 
cryptocurrency bitcoin is 
power hungry, and is increasing 
its environmental impact as 

its price and popularity rise. 
Cryptocurrencies are generated 
by specialized software, used 


to solve complex mathematical 
problems that represent proof- 
of-work algorithms in exchange 
for electronic coins (see
https://bitcoin.org/bitcoin.pdf).

Some estimate that the 
combined electricity consumption 
for bitcoin and ethereum mining, 
which together represent 88% of 
the total cryptocurrency market 
capitalization (G. Hileman and 
M. Rauchs http://doi.org/cj22; 
2017), has already reached a 
staggering 47 terawatt-hours per 
year and is on the rise (see
www.digiconomist.net). To put this into
perspective, Greece's population 
of 11 million consumes close to 
57 terawatt-hours annually. 

Moreover, 58% of all
cryptocurrency mining is done 
in China and is typically powered 
by coal plants. Using the life-cycle 
impact-assessment methodology, 
I estimate that the annual 
carbon footprint for bitcoin and 
ethereum mining is comparable 
to that of some 6.8 million average 
European inhabitants — or as 
much as 43.9 million tonnes of 
carbon dioxide equivalent (see 
ReCiPe and IPCC 2013 methods, 
respectively, at
go.nature.com/2nn7zzj).
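
As a rough cross-check of the figures quoted above, the following Python sketch computes two ratios from the letter's own numbers: mining electricity use as a share of Greece's annual consumption, and the per-person footprint implied by equating 43.9 million tonnes of carbon dioxide equivalent with 6.8 million average Europeans. The constant and function names are illustrative assumptions, and the calculation is plain division, not the life-cycle impact-assessment method itself.

```python
# Figures quoted in the letter; the names are illustrative, not from the source.
MINING_TWH_PER_YEAR = 47.0        # bitcoin plus ethereum mining
GREECE_TWH_PER_YEAR = 57.0        # Greece's annual electricity consumption
FOOTPRINT_TONNES_CO2E = 43.9e6    # estimated annual carbon footprint
EQUIVALENT_EUROPEANS = 6.8e6      # average European inhabitants


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, rejecting a zero denominator."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


# Share of Greece's annual electricity use taken up by mining.
share_of_greece = ratio(MINING_TWH_PER_YEAR, GREECE_TWH_PER_YEAR)

# Per-person footprint implied by the letter's equivalence.
tonnes_per_person = ratio(FOOTPRINT_TONNES_CO2E, EQUIVALENT_EUROPEANS)

print(f"Mining uses about {share_of_greece:.0%} of Greece's electricity use")
print(f"Implied footprint: about {tonnes_per_person:.1f} t CO2e per person")
```

On these numbers, mining draws a little over four-fifths of Greece's annual electricity use, and the equivalence implies roughly 6.5 tonnes of carbon dioxide equivalent per average European per year.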

In my opinion, the 
cryptocurrency industry is 
urgently in need of reform 
to make it environmentally 
sustainable. 

Spyros Foteinis PPC 
Renewables, Athens, Greece. 
sfoteinis@ppcr.gr 


Baleen whale species 
at risk of extinction 


The latest Critically Endangered 
list from the International 
Union for Conservation of 
Nature includes the Gulf of 
Mexico whale, a subspecies 

of Balaenoptera edeni (see 
go.nature.com/2bdntor). This 
mammal is at risk of being the 
first baleen whale to go extinct 
since the Atlantic grey whale 
(Eschrichtius robustus) three 
centuries ago. Yet the animal's 
new status has generated little 
public response. 


The Gulf of Mexico whale 
is similar to Bryde’s and Eden's 
whales (both also named 
B. edeni), but is genetically 
distinct from both. It is entirely 
confined to US waters in the 
Gulf of Mexico (see
go.nature.com/2bdntor). Survey data put
its abundance at 33 individuals 
in 2009, and modelling suggests 
that almost half its habitat was 
affected by the Deepwater 
Horizon oil spill in 2010 (see 
go.nature.com/2e6joqge). 

Rapid action is needed to 
eliminate sources of human- 
induced death and injury among 
these whales. A first step must be 
to raise society's and scientists’ 
awareness of their status. 

Peter Corkeron Northeast 
Fisheries Science Center, Woods 
Hole, Massachusetts, USA. 
Scott D. Kraus New 

England Aquarium, Boston, 
Massachusetts, USA. 
peter.corkeron@noaa.gov 


Value and reward 
regional research 


Incentives for publishing in 
international journals could 

be preventing ecologists in 
low-income countries from 
conducting the research needed 
to protect and restore their local 
environments. Few scientists are 
willing to do time-consuming 
taxonomic surveys, for example, 
because these will not generate 
highly cited publications. 

Yet effective management is 
impossible without such local 
ecological insight. 

Although reward structures 
for research vary substantially 
between and within countries, 
they are often based on scientists’ 
publication and citation counts 
in internationally recognized 
journals. In Mexico, for instance, 
this encourages research that 
appeals to reviewers and editors 
in distant countries, fosters 
publication in journals that are 
financially and linguistically 
inaccessible, and may not be 
relevant to local problems 
(M. W. Neff Sci. Public Policy
http://doi.org/cjz2; 2017).
Journals that are regionally 
relevant and in languages other 
than English suffer because top 
scientists eschew them, leaving 
university students, resource 
managers and policymakers with 
fewer resources. 

Mexico’s national research
policies provide clear examples 
of distorting incentives, but the 
problem is close to universal: 
what is countable is not always 
what we should be counting. 
Scientists and publishers need to 
exert their power to change these 
systems. 

Mark Neff Western Washington 
University, Bellingham, USA. 
mark.neff@wwu.edu 


Don’t belittle junior 
researchers 


The most interesting part of a 
scientific seminar, colloquium 
or conference for me is the 
question and answer session. 
However, I find it upsetting 

to witness the unnecessarily 
hard time that is increasingly 
given to junior presenters at 
such meetings. As inquisitive 
scientists, we do not have the 
right to undermine or denigrate 
the efforts of fellow researchers 
— even when their reply is 
unconvincing. 

It is our responsibility to 
nurture upcoming researchers. 
Firing at a speaker from the 
front row is unlikely to enhance 
discussions. In my experience, 
it is more productive to offer 
positive queries and suggestions, 
and save negative feedback for 
more-private settings. 

With belligerence supplanting 
courtesy inside and outside the 
conference room, it might be 
helpful for young researchers 
to be taught how to frame a 
question in a purely scientific 
way. Let us create a system in 
which junior scientists feel 
excited to share their data. 

Anand Kumar Sharma CSIR-
Centre for Cellular and Molecular
Biology, Hyderabad, India.
anandkumar@ccmb.res.in

BRIEF COMMUNICATIONS ARISING


Contesting early archaeology in California 


ARISING FROM S. R. Holen et al. Nature 544, 479-483 (2017); doi:10.1038/nature22065 


The peopling of the Americas is a topic of ongoing scientific interest 
and rigorous debate’. Holen et al.3 add to these discussions with their
recent report of a 130,000-year-old archaeological site in southern 
California, USA: the Cerutti Mastodon (CM) site, which includes the 
fragmentary remains of a single mastodon (Mammut americanum), 
spatially associated stone cobbles, and associated lithic debris that 
they claim indicates prehistoric hominin activity. In sharp contrast, we 
contend that the CM record is more parsimoniously explained as the 
result of common geological and taphonomic processes, and does not 
indicate prehistoric hominin involvement. Whereas further investiga- 
tions may yet identify unambiguous evidence of hominins in California 
around 130,000 years ago, we urge caution in interpreting the current 
record. There is a Reply to this Comment by Holen, S. R. et al. Nature 
554, http://dx.doi.org/10.1038/nature25166 (2018). 

Holen et al.3 claim prehistoric hominin involvement at the CM site 
based primarily on four lines of evidence: a reliable radiometric age; the 
presence of stone artefacts; clear evidence of tool-imparted percussion 
damage to the remains of the mastodon; and an undisturbed geological 
context. We take no issue with the published age for the site, but we 
believe that the latter three claims warrant further examination. 

The CM site stone artefacts are an assortment of cobbles and frac- 
tured cobble fragments excavated from a sandy silt matrix. There is 
no evidence of formal stone tool forms or diagnostic lithic micro- or 
macro-debitage. Instead, the CM site artefacts are identified by their 
proximity to the remains of the mastodon and their large size relative 
to the enveloping sediment. Additionally, surface features such as the 
presence of pitting and scratching on cobble surfaces, the presence of 
several cobble fragments with fracture morphologies reminiscent of 
hammerstone and/or anvil usage, and the presence of several refitting 
cobble fragments are interpreted as evidence of hominin percussive 
activities on-site. The lack of discarded formal tools and diagnostic 
lithic debitage is noteworthy, and is unusual relative to most archaeo- 
logical assemblages that purport hominin processing of proboscidean 
remains (although see Haynes“). We also note that upslope of the 
site there are numerous alluvial fans, with clastic material a common 
occurrence. The cobbles and pebbles at the CM site can be derived 
from modest alluvial fan input with fines subsequently winnowed with 
lower energy fluvial erosion. Crucially, none of the criteria that Holen 
et al.3 use to define stone artefacts either requires prehistoric hominin
involvement or meets the accepted criteria for falsifying natural 
‘geofacts’*. The range of possible geological interpretations for the lithic 
assemblage highlights the critical issue of equifinality, in which an end 
product such as a shattered cobble may be generated via many and 
potentially unrelated means. It is a principle that applies to the bone 
record as well. 

We contend that Holen et al.3 presented equivocal evidence in
support of tool-imparted percussion damage to the remains of the 
mastodon. The critical observations are of spiral fractures, cone 
flakes, impact flakes, bulbs of percussion, impact notches, negative 
flake scars, anvil-polished specimens, percussion-fractured specimens, 
and refitting specimens. These bone damage features are inferred 
to implicate sole agency by prehistoric hominins. As with the stone 
artefact record, issues of equifinality must first be addressed, including 
the question of whether other processes could produce such bone 
damage. 

Haynes*® presents compelling evidence of non-cultural and/or 
non-prehistoric processes producing comparable damage at the
24,000-year-old Inglewood Mammoth Site (IMS), Maryland, USA.
As at the CM site, the IMS contains the remains of a single juvenile 
proboscidean recovered in situ from a sealed deposit of sandy clays 
with pebbles and cobbles®. Haynes® provides descriptions and images 
of curvilinear and spiral ‘green-bone’ fractures on cranial, axial and 
appendicular specimens. Some of these fractures are recent in origin, 
probably related to heavy earthmoving equipment working on-site®. 
Other damage may reflect perimortem injuries sustained by the 
mammoth. No evidence of prehistoric hominin activities is noted
or suspected for the site. Post-burial bone notches, impact points and 
impact scratches occurred on a number of specimens. 

We report a similar record of fractured proboscidean bones at the 
Waco Mammoth National Monument (WMNM), Waco, Texas, USA. 
The site contains the remains of at least 26 mammoths buried in fluvial 
sands, silts and clays, and dates from 66,800 to 51,300 years ago’. The 
WMNM was initially investigated as a potential archaeological site,
although no evidence of prehistoric human activities was discovered. 
Figure 1 shows post-burial damage to WMNM mammoth long bones 
morphologically consistent with observations from the IMS and CM 
sites. This includes damage that resembles spiral fractures with asso- 
ciated sedimentary abrasion, hammerstone pseudo-notches*, negative 
flake scars, partially detached flakes and incipient notches, and bulbs 
of percussion. Such damage, including spiral fractures, is well known 
in the fossil record from as early as the Triassic period? and can occur 
post-burial®. They neither require nor solely indicate prehistoric human
agency.

Other proboscidean assemblages share a similar taphonomic sig- 
nature with the WMNM, IMS and CM sites. Holen and others report 
various combinations of spiral fractures, impact points, bulbs of per- 
cussion and bone flakes at numerous other late Pleistocene mammoth 
death sites in the Americas!!!. As with the CM site, these latter assem- 
blages uniformly lack unambiguous stone tools, cut marks, or any 
other unquestionable evidence of hominin activities, and most predate 
well-vetted geochronological and palaeogenomic evidence of the initial 
peopling of the Americas around 15.5 thousand years ago??!7-"4, 

Moreover, it is not just what is present at the CM site, but also what 
is missing. Specifically, hammerstone striae and/or pits (HSSP)° are 
noticeably absent despite reasonable cortical bone preservation, several 
hundred bone fragments, purported hammerstones, and purported 
anvil abrasions on both the bones and the cobbles. Experimental studies 
show that hammerstone-percussed proboscidean limb bone fragments 
should bear HSSP on greater than 30% of specimens created when 
using a hafted hammerstone and anvil'?. The absence of HSSP at the 
CM site—a proposed percussed bone assemblage—cannot be explained 
using current experimental models and contradicts the assumption of 
hammerstone-wielding hominin involvement in bone breakage. 

Lastly, we question the assertion of an “undisturbed geologic 
context” at the CM site. Although the distance between some refitted 
finds necessarily suggests pre-burial breakage and scattering of some 
items (for example the molar fragments), other features of the record 
plausibly reflect subsequent forces modifying the assemblage over the 
last 130,000 years. As fluvial deposits slowly covered the remains, the 
bones of the mastodon would have remained semi-pliable for years®. 
Proboscideans or other large mammals subsequently using the muddy 
watercourse could potentially trample, displace, fracture, abrade and 
reorient (for example the semi-vertical tusk) the interred materials*®. 
Later sediment compaction by metres of overburden and then eventual
disturbance by heavy construction equipment (see supplementary
information 6 of ref. 3) would confound the taphonomic history of the
site further, as it has at both IMS and WMNM*°.

Figure 1 | Fractured proboscidean bones from WMNM. a, Curvilinear
fracture between refitted fragments BU-MMC-641a and BU-MMC-641b.
b, The opposite side of the refitted fragments depicted in a.
c, Hammerstone-like ‘micro’-notch® on the cortical surface without
diagnostic percussion pit, fragment BU-MMC-1011a. d, Negative flake
scar on the same bone as that depicted in c. e, Post-burial curvilinear
fracture on fragment BU-MMC-210a. f, Bone flake with bulb of
percussion, fragment BU-MMC-642b. g, Comminuted fracture with
refitted flakes and associated sedimentary abrasion on fragment
BU-MMC-76a. Scale bars, 5 cm.

The extraordinary claim by Holen et al.3 of prehistoric hominin
involvement at the CM site should not be contingent on evidence that 
is open to multiple, contrasting interpretations. Until unambiguous 
evidence of hominin activities can be presented, such as formal stone 
tools or an abundance of percussion pits, caution requires us to set 
aside the claims of Holen et al.3 of prehistoric hominin activities at
the CM site.

Methods

The Baylor University Mayborn Museum Complex (BU-MMC), Waco, Texas, 
is the official repository for around 70% of the recovered WMNM remains, 
with the remainder left in situ for display at the WMNM site. Approximately 
1,100 trays of curated fossils are available for study, with most trays containing 
multiple specimens. Individual specimens are labelled here based on tray number 
(BU-MMC), followed by a letter designation (for example 642a, 642b). Specimens 
were selected based on gross bone damage morphologies, with the aim of recording 
damage similar to that reported from the CM site. Images were obtained using a 
Canon EOS Rebel XS digital camera.

Data availability. All data are available from the corresponding author upon 
reasonable request.

Holen et al. reply


REPLYING TO J. V. Ferraro et al. Nature 554, http://dx.doi.org/10.1038/nature25165 (2018) 


Contrary to our hypothesis1 that the Cerutti Mastodon (CM) site rep-
resents a 130,000-year-old archaeological site, in the accompanying
Comment2 Ferraro et al. argue that the site formed through ‘common
geological and taphonomic processes’. As a source for the cobbles that
we interpreted as hammerstones and anvils, they postulate a previ- 
ously unrecognized alluvial fan, later removed by fluvial winnowing 
that somehow left our five cobbles, refitting flakes, and fragments of 
stone, mastodon bone and teeth in place. There is no sedimentological 
or geomorphic evidence of an alluvial fan, and their scenario leaves 
unexplained a number of taphonomic features, including the two 
discrete concentrations in which were found cobbles, refit stones and 
bones, impact-fractured bones, side-by-side femoral heads and a tusk 
oriented vertically. 

Ferraro et al.2 also speculate that the stone and bone fractures that we
analysed can be explained by post-burial processes such as sediment 
compaction or interaction with excavation equipment, whereas we con- 
tend that these features are part of the CM biostratinomic (pre-burial) 
record. Support for our view is provided by the fact that most CM bones 
and stones were enclosed within crusts of pedogenic carbonate that 
establish a ‘chain of evidence’ showing that breakage and positioning 
of objects occurred many thousands of years ago, and, as we contend, 
before burial*. The only pre-burial cause of bone breakage Ferraro 
et al.2 consider is trampling, which we have argued is incompatible
with other site data1.

Ferraro et al.2 draw comparisons to the Inglewood Mammoth Site
(IMS)*5 and the Waco Mammoth National Monument (WMNM)°. For 
the IMS, they cite an observationally based study* that proposes that 
excavating equipment caused the spiral fractures on many of the bones. 
However, this claim is compellingly refuted by an experimentally based 
study” that shows that the IMS spiral fractures are ancient after all, and 
probably occurred before burial. 

WMNM bones illustrated by Ferraro et al. (figure 1 of ref. 2) lack
clear evidence of true spiral fractures or normal impact notches’, 
instead representing classic examples of dry bone fracture, with rough 
texture on fracture surfaces and contrasting coloration of broken versus 
cortical surfaces (figure 1b, d, g of ref. 2). The closest approach to a 
notch (shown in figure 1c, d of Ferraro et al.2) is a shallow, irregularly
arcuate break—described as a pseudo-notch or micro-notch—that
does not extend to the medullary portion of the bone, unlike the ‘nor-
mal notch’ we illustrated1, which was defined by two clear inflec-
tion points, a negative flake scar, an attached cone flake and smoothly
curved fracture surfaces that extend completely through the cortical 
portion of the bone. Only ‘normal notches’ are used to determine 
human agency”*. 

By overlooking the most important bone evidence, which includes 
impact features such as cone flakes, bulbs of percussion and a large 
impact notch with associated negative flake scar, as well as bone 
distribution patterns, bone refits and missing femoral diaphysis pieces, 


Ferraro et al.2 did not consider precisely those features that are indi-
vidually and collectively most likely to have been caused by cultural
processes. They have not offered a cogent alternative site formation 
hypothesis that accounts for all evidence presented.

Wood made denser and stronger


An improved method for compressing wood substantially increases its strength and stiffness, opening up the possibility of
applications in engineering for which natural wood is too weak. SEE LETTER P.224 


PETER FRATZL 


Wood is among the oldest materials
used by humans, and is still com-
monly used for building". Its low
density has also made it useful for transport
applications such as shipbuilding, but this 
property is associated with a relatively low 
strength and stiffness. Scientists have tried 
to devise processes that make wood denser, 
to obtain materials suitable for high-strength 
applications, but with limited success. On 
page 224, Song et al.” describe a densification 
method that combines a chemical treatment 
with high-temperature compression, and 
which produces an unprecedented increase in 
stiffness and strength. 

The authors’ method starts by treating wood 
blocks with sodium hydroxide and sodium sul- 
fite, a chemical process similar to the method 
used to pulp wood to make paper. This chemi- 
cal treatment partially removes lignin and 
hemicelluloses (Fig. 1). Lignin is a biopoly- 
mer that has many functions in plants, such 
as stabilizing cell walls in wood and retarding 
attacks on wood cells by parasites and bacteria; 
hemicelluloses are sugar chains that cover and 
bind fibrils of cellulose in the cell walls. 

Song and colleagues then compress the blocks at temperatures of about
100°C. This removes most of the pores in the wood, and increases its
density from 0.43 grams per cubic centimetre to 1.3 g cm⁻³. The
resulting stable material is too dense to float on water, but the
authors report that its stiffness and strength have both increased
impressively, by a factor of about 11 compared with untreated wood.
As the authors point out, previous attempts to densify wood also
improved the strength, but by no more than a factor of about three to
four. The secret to Song and colleagues’ success lies in their
combination of chemical treatment and high temperatures during
pressing.

Figure 1 | A process for densifying wood. Natural wood contains
pores formed from the remains of parallel, tube-like cells, the walls of
which contain cellulose, along with biopolymers known as lignin and
hemicelluloses. Song et al. treated natural wood with a mixture of
sodium hydroxide and sodium sulfite, which partly removed the lignin
and hemicelluloses. They then compressed the wood at about 100°C,
which caused the cells to collapse. The resulting material was about
3 times as dense as natural wood, and about 11 times as stiff and strong,
making it potentially useful for high-strength engineering applications.
(Adapted from ref. 2.)
[Figure panels: natural wood; chemical treatment; hot compression; densified wood.]

Natural wood contains a multitude of parallel, tube-like cells, the
walls of which constitute the major part of the material. In most
parts of woody stems, the cells have died and left behind their
cellulose-rich cell walls. These walls also contain lignin and
hemicelluloses, and form hollow wood fibres. The tube-like fibres
collapse laterally when compacted, effectively losing their hollow
interiors. This increases the amount of material per cross-section of
the stem, as evidenced by the increased density reported by Song and
colleagues. On its own, this effect would be expected to cause the
stiffness and strength of wood to increase in proportion to the
increase in density.

However, the authors report that the stiffness increases by a factor
of 11, whereas the density increases by a factor of only 3. A
threefold density increase has been observed in previous work that
used hot pressing alone (see ref. 3, for example). It therefore seems
likely that the authors’ chemical treatment modifies and strengthens
the cellulose-based composite that makes up cell walls in wood.

Many cellulose-based materials swell undesirably when they come into
contact with water, but Song and colleagues report that water swelling
of their densified wood is tolerably small. It remains to be seen
whether the partial removal of lignin from the material makes it
susceptible to bacterial or fungal attacks.

The densified wood is still lighter than metallic materials, so its
stiffness and strength open up the potential for many engineering
applications. This raises the question of why trees use a porous
material for their trunks, when their goal in a forest is to be as
high as possible, to ensure that their leaves are exposed to light, a
task for which stiffer and stronger materials might intuitively seem
better suited. By making wood porous, trees partially sacrifice the
material's strength. One answer is that wood is multifunctional, and
so the pores are needed for more than just structural tasks, such as
to transport water and nutrients.

But the optimal response of natural materials to a load varies
according to the function involved, such that lower density can be
more important than higher strength. In brief, the height of a slender
column that supports a compressive load along its axis is often
limited by the risk of buckling; for a given column width and
compressive load, higher columns can be built by using materials that
have a higher Young’s modulus (E, a measure of stiffness). When the
height of the column is not limited by an external load, but by just
its own weight, then greater heights can be attained using a less
dense material: the aim in this context is to maximize the ratio of E
to the density ρ, rather than just E. And when the goal is to build
the highest possible column using a fixed mass of material, then it is
best to maximize E/ρ². (Maintaining a constant mass is relevant to
plants, because synthesizing material is a major cost for them;
maximizing E/ρ² corresponds to the most economical way of growing the
highest possible column at fixed material costs.)

A consideration of these principles reveals 
that Song and colleagues’ densified wood should 
perform better than natural, porous wood in the 
first two scenarios (in which E or E/ρ need to
be as large as possible), but only about equally
well in the third situation, for which E/ρ² is
maximized, on the basis of the changes in
stiffness and density reported by the authors. This
indicates that trees do not lose much by mak- 
ing wood porous, and that the introduction 
of pores for water transport comes at no extra 
material cost. Perhaps because of this, the height 
of trees is likely to be limited more by hydrau- 
lic constraints linked to water transport than 
by mechanical constraints. Similarly, many
advanced-engineering applications require 
materials that have high stiffness and strength, 
but in some cases porous materials would 
increase performance, rather than decrease it. 
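
The comparison of the three scenarios can be checked with a few lines of arithmetic. The sketch below is only illustrative: it takes the roughly 11-fold stiffness gain and 3-fold density gain quoted in the text, and it assumes that the third index is E/ρ², which is the reading consistent with the "about equally well" conclusion. The function and constant names are hypothetical.

```python
# Reported changes for densified versus natural wood (values from the text).
STIFFNESS_FACTOR = 11.0  # Young's modulus E rises about 11-fold
DENSITY_FACTOR = 3.0     # density rho rises about 3-fold (0.43 -> 1.3 g/cm^3)


def index_ratios(e_factor: float, rho_factor: float) -> dict[str, float]:
    """Return densified/natural ratios for the three column-height indices.

    E        : column under an external load (buckling-limited)
    E/rho    : column limited by its own weight
    E/rho^2  : tallest column from a fixed mass (assumed form of the index)
    """
    return {
        "E": e_factor,
        "E/rho": e_factor / rho_factor,
        "E/rho^2": e_factor / rho_factor**2,
    }


ratios = index_ratios(STIFFNESS_FACTOR, DENSITY_FACTOR)
for name, value in ratios.items():
    print(f"{name}: densified/natural = {value:.2f}")
```

With these inputs the first two ratios are well above 1 (11 and about 3.7), whereas the third is about 1.2, which is why densified wood offers little advantage when material cost is the constraint.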

All biological materials are active, and adapt 
their internal structure to their function and 
to environmental needs. Two strategies can be 
used to repurpose such materials for engineer- 
ing applications. One is to modify the material 
to comply with specifications in industrial 
design, as exemplified by Song et al. with their 
densification procedure. The other, perhaps 
more conventional, option is to adapt designs 
to the properties of natural materials. The 
latter approach is more sustainable, but would 
require greater knowledge of how structure 
relates to function in such materials, and the
development of new design approaches.


Peter Fratzl is in the Department of 
Biomaterials, Max Planck Institute of Colloids 
and Interfaces, Potsdam 14424, Germany. 
e-mail: fratzl@mpikg.mpg.de 


1. Ramage, M. H. et al. Sustain. Energ. Rev. 68, 333-359 (2017).
2. Song, J. et al. Nature 554, 224-228 (2018).
3. Erickson, E. Mechanical Properties of Laminated Modified Wood
   (US Dept Agriculture, 1965).
4. Ashby, M. F. Metall. Trans. A 14, 1755-1769 (1983).
5. Ashby, M. F., Gibson, L. J., Wegst, U. & Olive, R.
   Proc. R. Soc. Lond. A 450, 123-140 (1995).
6. Gibson, L. J., Ashby, M. F., Karam, G. N., Wegst, U. &
   Shercliff, H. R. Proc. R. Soc. Lond. A 450, 141-162 (1995).
7. Fratzl, P. & Weinkamer, R. Prog. Mater. Sci. 52, 1263-1334 (2007).
8. Niklas, K. J. Tree Physiol. 27, 433-440 (2006).
9. Schaffner, W. in +Ultra Knowledge & Gestaltung (eds
   Doll, N., Bredekamp, H. & Schaffner, W.) 23-32 (Seemann, 2017).
10. Fratzl, P. in +Ultra Knowledge & Gestaltung (eds
   Doll, N., Bredekamp, H. & Schaffner, W.) 173-178 (Seemann, 2017).


CANCER RESEARCH


Many mutations in one 
clinical-trial basket 


When abnormality in a gene is linked to cancer and a drug targets the encoded 
protein, how can the patients who will respond to the drug be identified if the gene 
is mutated in many different ways in many different cancers? SEE ARTICLE P.189 


ELAINE R. MARDIS 


Cancer usually arises from genomic abnormalities. However, the number
and complexity of genetic alterations in tumours can make it difficult
to predict whether, and in which tissues, a particular mutation in a
specific cancer-linked gene will drive tumour growth. This poses a
challenge when trying to identify effective treatments. For example,
if a drug that targets a specific protein can treat a person with
breast cancer who has a mutation in the gene encoding the protein,
could the drug treat another patient who has a different mutation in
that gene? And could it treat a person with a mutation in the same
gene, but in a tumour that has developed in a different tissue? On
page 189, Hyman et al. report the outcome of a clinical trial testing
the ability of the drug neratinib, which inhibits HER2 and HER3
tyrosine kinase enzymes, to reduce or eliminate tumours. The drug was
tested on 21 types of cancer in 141 people who had a total of
42 different mutations affecting one of the enzymes.

Studies in the 1970s revealed that certain chromosomal DNA aberrations
can be linked to the development of specific cancer types, and that an
amplification in the number of copies of particular genes can have a
tumour-promoting effect. For example, a highly lethal type of breast
cancer is linked to amplification of the gene ERBB2 and an increase in
the level of the HER2 protein that it encodes. HER2 amplification
occurs in several other cancers, including colorectal adenocarcinoma
and bladder cancer. This understanding led to efforts to develop
treatments to stop the action of such overexpressed proteins,
resulting in several HER2-targeted therapies that are used in the
clinic to prolong survival in people whose cancers have amplification
of ERBB2. Other links between ERBB2 abnormalities and cancer have been
identified; for example, single-nucleotide mutations in ERBB2 have
been found in breast cancers that do not have amplified ERBB2, and in
lung adenocarcinomas.

Figure 1 | Results of a cancer clinical trial. Hyman et al. report the outcome of a study testing how
effectively the drug neratinib can treat tumours. The tyrosine kinase enzymes HER2 and HER3 have
been linked to tumour growth and can be inhibited by neratinib. The 141 patients tested had a range
of mutations that altered HER2 or HER3, and, between them, had many different tumour types. The
protein structures are shown, and arrows indicate the domains or interdomain locations at which protein
alterations due to mutations were found. For the HER2 data shown, the cancers were grouped into ten
cancer-type categories: biliary, bladder, breast, cervical, colorectal, endometrial, gastro-oesophageal,
lung, ovarian or other (for all other cancer types). Responding patient numbers indicate those whose best
overall response to the drug was a partial or complete response (a decrease or absence, respectively, of
detectable cancer at the end of the trial).

The rapid development of therapeutics 
targeting specific cancer-associated proteins 
has coincided with the rise in DNA sequencing 
of tumours. In the past decade, the genomic 
alterations in tens of thousands of cancers 
have been characterized at single-nucleotide 
resolution. This has revealed that cancer- 
associated genes can be altered in myriad 
ways and that such alterations can be found 
in primary tumours that arise in many differ- 
ent tissues. However, such variability makes 
it hard to predict whether a specific drug will 
have an effect on a patient’s cancer; this, in 
turn, complicates the decision of who to enrol 
in a clinical trial. One approach to this prob- 
lem involves introducing the mutated genes in 
question into preclinical model systems such 
as genetically engineered mice or cell-line 
models, but these models are not practical for 
large-scale investigations of many different 
gene alterations in different tissue types. 

The design of clinical trials testing targeted 
therapeutics has changed substantially in the 
era of cancer genomics. Early-phase trials, in 
particular, now often include people who have 
an altered target gene, regardless of the tissue 
in which the tumour is present. These ‘basket’ 
trials seek to identify the combination of muta- 
tions and tissues that respond to treatment, 
offering the opportunity, if a trial progresses 
to a later stage, to focus on tumours in those 
tissues that are most likely to respond. 

The ability of neratinib to target tumours
with ERBB2 mutations had been demonstrated
in human-tumour samples transplanted
into mice. Hyman et al. used a
basket-trial approach to test the effects of the 
drug on many patients with known tumour- 
driving ERBB2 mutations; they also examined 
its effects on a small number of patients who 
had either rare ERBB2 mutations or mutations 
in ERBB3, the gene that encodes HER3 and 
that has also been linked to tumour growth.
An interesting feature of the trial design is 
that it included people with mutations that 
had not previously been tested for a response 
to the drug. Some tumour types studied by 
Hyman and colleagues were not represented 
in sufficient numbers for the team to assess 
whether treatment had had a statistically sig- 
nificant effect, and enrolment in the trial is 
continuing for specific tissues. 

The authors found that the effect of neratinib 
therapy varied in different mutational and tis- 
sue contexts. For example, some people who 
had breast, small-cell lung, cervical, biliary or 
salivary cancers, and who had certain ERBB2 
mutations, responded to the treatment; the 
greatest effect was observed for breast cancers 
containing amino-acid alterations in the 
extracellular or kinase domains of HER2 
(Fig. 1). Several patients with previously
uncharacterized ERBB2 variants responded to
neratinib, supporting the role of these mutations
as tumour drivers. Neratinib had no
effect on tumours with ERBB3 mutations, nor
did it affect colorectal or bladder cancers that
had ERBB2 mutations. The bladder-cancer
result is consistent with previous studies in
which HER2 targeting did not affect this type of
cancer. Lack of response to neratinib provides
circumstantial evidence that rare alterations in
ERBB2 are unlikely to be tumour drivers.

Hyman and colleagues’ results indicate that
preclinical model studies, such as those suggesting
that ERBB3 can drive tumour growth,
can sometimes be misleading when trying to
infer what happens in a human tumour. This 
might be because of how the overall genomic 
context influences the effect of a mutation. A 
tumour that has an altered target gene could 
also have alterations in other cancer-promot- 
ing genes. Another source of inconsistency 
between human and mouse studies might be 
the particular tissue context. 

Finally, the genomic heterogeneity of 
tumour cells (the presence of groups of cells in 
the tumour that contain different genetic alter- 
ations) might be important in determining 
treatment response. Sequencing analysis con- 
ducted by Hyman and colleagues for certain 
ERBB2 mutations demonstrated that most 
patients whose ERBB2 mutations were present 
in all the tumour cells responded to neratinib, 
whereas those with ERBB2 mutations in only 
a subset of the tumour cells did not respond. 

The authors noted that response to 
treatment could be affected by the particular 
genetic mutation, the location of the tumour 
and the specific pattern of other mutated 
cancer-associated genes present. This will
probably hold true for most, if not all, future 
basket trials of targeted inhibitor therapies 
and is quite instructive for such studies. 
More-complete genomic characterization of 
tumours, beyond the gene(s) being targeted, 
will be needed to determine the genomic 
context linked to response or resistance to 
treatment. The genomic profiles and thera- 
peutic-response data from basket trials such 
as this one should be made publicly available 
as a way of improving the design of clinical 
trials of other agents. Such data sets might 
contribute to the development of diagnostics 
that enable the precise identification of those 
patients who are most likely to benefit from 
targeted treatment. The data could also help 
to streamline the design of clinical trials and 
thereby hasten cancer therapeutics towards 
regulatory approval.
Fossil-fuel subsidies assessed


Many governments subsidize the production and consumption of fossil fuels. 
Contrary to expectation, a study finds that removing these subsidies would only 
modestly reduce global carbon dioxide emissions. SEE LETTER P.229 


IAN PARRY 


The 2015 Paris climate agreement was signed by 195 countries, with
most pledging to reduce their emissions of carbon dioxide and other
planet-warming gases. Many countries have a long history of
subsidizing fossil fuels, and it seems logical that removing these
subsidies, as the G20 group of nations has agreed to do, would help
them to achieve their Paris climate commitments. However, on page 229,
Jewell et al. report a comprehensive and convincing analysis
suggesting that reforming these subsidies would cause only a modest
reduction in global CO₂ emissions. Nevertheless, I think that there
is an urgent need for broader reform of fossil-fuel prices to fully
reflect the costs associated with global warming and other
environmental considerations.

Subsidy reform would increase domestic 
fossil-fuel prices to match the production 
costs. Its impact on the climate would therefore 
depend on how energy demand is affected by 
these higher fuel prices, for example through
people driving less, power generators switching 
to cleaner fuels such as those from renewable 
energy sources, and households and businesses 
adopting energy-saving technologies. Because 
these responses are inherently uncertain, Jewell 
et al. used five different models to assess the 
consequences of subsidy reform. These models
compared projections of fuel use and CO₂
emissions with and without subsidy reform by
region or country, using diverse assumptions 
about future economic growth, technological 
trends, energy prices and so on. 

The authors found that removing all fossil- 
fuel subsidies would have a limited impact 
on global energy demand by 2030 (a reduc- 
tion of about 1-4%). In addition, the share of 
energy from renewable sources would rise by
less than 2%, and global CO₂ emissions would
fall by only 1-4% (under both low and high
oil prices). Consequently, in most regions, the
CO₂ reduction from subsidy reform would fall
far short of what is needed to meet the Paris
climate pledges (Fig. 1). The exceptions are 
regions such as Russia, the Middle East and 
North Africa, where subsidies are heavily 
concentrated and pledges are less ambitious. 

There are two main reasons for the generally
modest impact of subsidy reform on CO₂
emissions. The first is that coal (the fossil
fuel that emits by far the most CO₂ per unit
of energy) currently receives little subsidy.
Instead, 60% of subsidies are for oil, and the
remainder is largely for natural gas and for
the electricity generated from fuels (see
Figure 2a of the paper). The second reason is that
global subsidies have declined sharply, from
US$570 billion in 2013 to $330 billion in 2015.

However, I think that reform of fossil-fuel 
prices needs to go well beyond aligning them 
with production costs. Fuel prices should 
also reflect the consequences of their use for 
global warming and other environmental 
considerations, such as the costs of deaths 
resulting from air pollution and, in the case 
of road fuels, traffic congestion and accidents. 
Furthermore, prices for fuels purchased by 
households should include the general sales 
or value-added taxes that are applied to other 
consumer products. 

A study in 2017 estimated that if fossil-fuel
subsidies had been defined more broadly to 
reflect undercharging for environmental costs 
and general taxes, as well as production costs, 
these subsidies would have totalled $5.3 tril- 
lion in 2015 (6.5% of global gross domestic 
product). Furthermore, the study suggested 
that if prices had fully accounted for produc- 
tion costs, global and domestic environmental 
impacts and general taxes in 2013, global CO₂
emissions would have been 21% lower than 
they were, air-pollution deaths associated with 
fossil fuels would have been 55% lower, and 
government revenues as a percentage of gross 
domestic product would have been 4% higher. 

Broader reform of fossil-fuel prices is therefore urgent for both
developed and developing countries. However, such a reform must be
carefully crafted to enhance the prospects for success. A
comprehensive plan should be developed, in consultation with
stakeholders, that has clear goals and timetables. It should specify
the taxes to be cut or the public investment programmes to be
expanded, using revenue raised by fuel-price reform. In addition,
there should be measures to compensate low-income households for the
effects of higher energy prices and to help workers who might lose
their jobs in energy-intensive industries.

Figure 1 | Impact of fossil-fuel-subsidy reform. Fossil fuels are
subsidized in many countries, and it was thought that removing these
subsidies would lead to a substantial reduction in carbon dioxide
emissions. However, Jewell et al. report an analysis suggesting that
the resulting change in CO₂ emissions by 2030 would be modest. The
exceptions are regions in which current subsidies are heavily
concentrated (shown in red), such as the Middle East and North Africa
(MENA). The bars denote the range of emission changes predicted (under
low oil prices), and asterisks indicate regions that constitute more
than the designated country. (Adapted from Fig. 3b of ref. 1.)

Researchers and international organizations (such as the International
Monetary Fund, World Bank and the Organisation for Economic
Co-operation and Development) have an important role in providing
information and guidance to help policymakers drive forward subsidy
reform and communicate the case for reform to the public. The
information required includes the fossil-fuel prices that countries
should adopt, both to meet their Paris climate pledges and to reflect
the broader environmental costs.

But it also includes the effect of reform on energy systems, the
economy, fiscal balances and vulnerable groups, and the trade-offs
between higher fuel prices and other policy approaches, such as
requirements for energy efficiency and renewable fuels. Analysis of
ongoing reform experiences in different countries could also help
governments to navigate around the political obstacles.

Rigorous studies, such as that by Jewell and colleagues, are
essential. But there is a need to focus these studies on the broader
reform issues discussed here, for which the stakes are especially
high.


BIOMECHANICS


Evolutionary race as predators hunt prey

Remote-sensing data for wild animals such as lions reveal that
predators and prey optimize manoeuvrability rather than speed during
the hunt. SEE ARTICLE P.183


ANDREW A. BIEWENER 


The survival of predators and prey depends on their respective
abilities to successfully chase food and escape capture, thereby
exerting strong selective pressure on their running ability and
behavioural strategies. Perhaps nowhere on Earth does this play out
more dramatically than on the African savannah, where the fastest
terrestrial predators chase their fleet-footed prey. Yet direct
measures of the key factors driving this type of hunt performance in
the wild are difficult to obtain. On page 183, Wilson et al. report
findings from their use of data-capturing collars to track the
movement dynamics of wild animals in Botswana during hunts. The
authors also conducted computer modelling of predator-prey
interactions and carried out laboratory tests to assess the properties
of the animals’ muscles.

Figure 1 | A lioness hunting a zebra in Etosha National Park, Namibia. Wilson et al. report their analysis of the movement dynamics of predator-prey hunts
in the wild in Africa using data gathered remotely from Global Positioning System sensing collars placed on lions, zebras, cheetahs and impala.

In recent years, the ability to use remote-sensing devices under
natural field conditions and over long time frames has led many to
study animals’ migratory, foraging and collective-movement behaviour,
which has provided fascinating insights into biomechanics, physiology
and decision-making. Wilson and colleagues took a remote-sensing
approach to study lions preying on zebras, and cheetahs preying on
impala, in
the wild. The authors temporarily immobilized 
animals and fitted them with lightweight collars 
containing technically sophisticated, custom- 
designed, miniature electronic and Global 
Positioning System (GPS) devices. The devices 
monitored the animals’ location, movement 
direction and acceleration patterns. Wilson 
et al. tracked 9 lions, 5 cheetahs, 7 zebras and 
7 impala, and recorded 2,726 high-speed runs 
for lions, 520 for cheetahs, 1,801 for zebras and 
515 for impala. This remarkable data set logs 
individual animal strides and provides informa- 
tion about the speed, acceleration and turning 
performance of these predator-prey pairs. 
The animals were not observed directly, 
and one limitation of the recorded data is 
that few, if any, of the movement tracks rep- 
resented hunts between pairs of predator and 
prey, with both animals recorded as one hunts 
the other. Therefore, the hunting strategies of 
predator and prey must be inferred from the 
collar-recorded data, making the assumption 
that the movement patterns represent actual 
hunts. However, the locomotor performance
recorded by the remote-sensing collars and the
hunting strategies that could be inferred from
these measurements are consistent with behavioural
observations made by others. Moreover,
analysis of the full data set revealed that preda- 
tors and prey exhibited manoeuvrability near 
the limits of their capability. Hence, although 
recordings of one-on-one hunts are lacking, 
the data were consistent with maximal pred- 
ator-pursuit and prey-evasion performance, 
enabling the authors to model hunt outcomes. 

After collar placement, a tiny biopsy of 
hindlimb muscle was taken from the animals 
for subsequent state-of-the-art laboratory test- 
ing of single-muscle-fibre contractility. This 
revealed that, compared with the muscle fibres 
sampled from the prey species, the predator 
muscle fibres deliver more power for a given 
muscle mass when they contract, allowing 
the predators to run faster and accelerate and 
decelerate more quickly than their prey. With 
more-powerful muscles than their prey and 
claws to grip the ground effectively, predators 
are better at accelerating into a turn (centrip- 
etal acceleration) than their prey are. 

Wilson and colleagues’ acceleration and 
GPS recordings indicated that, during inferred 
hunts, the predators and prey regularly 
achieved their maximal turning performance 
but ran at speeds well below their athletic 
capabilities. Running at speeds slower than 
maximum capacity during a pursuit enhances
manoeuvrability, which improves the prey’s
probability of successful escape and enables 
predators to better track their prey’s move- 
ments, thereby increasing the number of 
successful hunts. 
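
The trade-off between speed and manoeuvrability follows from the standard relation for circular motion, a = v²/r: for a fixed maximum centripetal acceleration, the tightest achievable turn radius grows with the square of running speed. The sketch below illustrates this relation only. The function name and the acceleration limit are hypothetical and are not values reported by Wilson et al.

```python
def min_turn_radius(speed_m_s: float, max_centripetal_accel_m_s2: float) -> float:
    """Tightest turn radius (m) at a given speed, from a = v^2 / r."""
    if max_centripetal_accel_m_s2 <= 0:
        raise ValueError("maximum centripetal acceleration must be positive")
    return speed_m_s**2 / max_centripetal_accel_m_s2


# Hypothetical grip-limited centripetal acceleration in m/s^2,
# chosen for illustration only (not a measured value).
A_MAX = 10.0

for speed in (5.0, 10.0, 20.0):
    radius = min_turn_radius(speed, A_MAX)
    print(f"speed {speed:4.1f} m/s -> minimum turn radius {radius:5.1f} m")
```

Doubling the speed quadruples the minimum turn radius, so an animal running below its top speed can change direction far more sharply.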

Using their field-recorded locomotion data, 
Wilson and colleagues modelled predator and 
prey capture-evasion tactics to examine how 
different performance metrics, such as speed, 
separation distance between the animals, 
deceleration, acceleration and turning rate, 
would affect the outcome of a hunt. Evasion 
modelling showed that prey escape was more 
likely if a prey animal relied on turning more 
sharply and at a greater rate than its pursuer. 
This type of behaviour increases the unpre- 
dictability of the prey’s movement trajectory, 
as has also been observed for bipedal desert 
rodents fleeing a predator. Wilson et al. noted
that, during the predators’ approach (Fig. 1), 
they exhibited greater deceleration and 
acceleration than that of the prey, allowing the 
predators to close in on and better track the 
prey’s lateral movements. The close match of 
athletic performance between predators and 
prey highlights the strong selection pressure 
that has resulted in an evolutionary ‘arms 
race’ for improved locomotion ability in large 
carnivores and their large herbivorous prey. 

The increasing use of remote-sensing 
technologies in animal studies is enabling 
the monitoring of factors such as animal 
acceleration, pressure (for example, during
flight or when swimming at depth) and tem- 
perature. Such work promises to illuminate 
not only predator-prey interactions, but also 
how wild animals cope with other real-world 
issues. For example, this type of research
could enhance our understanding of how ani- 
mals are dealing with the impacts of climate 
change, or offer insight into the factors gov- 
erning behaviours such as habitat selection, 
mating and foraging. Moreover, understand- 
ing how animals move might inspire the 
design of robots that can negotiate complex
environments.
A chirp, a roar and a whisper


In 2017, gravitational waves and electromagnetic radiation were detected from 
the merger of two stellar remnants called neutron stars. An observational analysis 
reveals how this radiation was released from the merger. SEE LETTER P.207 


RALPH WIJERS 


Last year, scientists reported the coalescence of two astronomical
objects known as neutron stars. The event, called GW170817, produced
gravitational waves, which had weakened to a faint ‘chirp’ by the time
they reached us. In addition, some of the matter in the neutron stars
was ejected into space. Moments later, this matter was hit by a
powerful jet of material from the merged stars, resulting in a roaring
outburst of radiation at all wavelengths. However, despite a flood of
data, the process by which this radiation was generated has not been
certain. On page 207, Mooley et al. report that GW170817 still
whispers to us in radio waves. These signals suggest that the observed
radiation came from a relatively slow-moving ‘cocoon’ of matter that
was energized by the jet.

The 1993 and 2017 physics Nobel prizes were awarded for the indirect
and direct detection of gravitational waves, respectively. These
studies concerned systems that can be well described using only
Einstein’s theory of general relativity. But astrophysics is rarely so
simple. For instance, when two neutron stars merge, they produce
fireworks: they deform, splash, explode and radiate. Consequently, all
the complexities of fluid dynamics, magnetic fields, nuclear
reactions, particle acceleration and radiation come into play.
Astronomers cannot create and tune experiments, but must make do with
the messy ones performed by nature.

Figure 1 | Radiation from a neutron-star merger. A pair of stellar remnants called neutron stars can
orbit each other, gradually getting closer, before eventually merging. In 2017, electromagnetic radiation
was detected from a neutron-star merger. Mooley et al. report evidence for a model that explains how
this radiation was generated. In the model, some of the matter in the neutron stars is ejected. This matter
is then energized by a powerful jet of material from the merged stars, creating a relatively slow-moving
‘cocoon’ of matter. The cocoon then emits the observed radiation.
[Figure labels: radiation; cocoon; ejected matter.]

What astronomers can do, however, is take 
advantage of two of the biggest revolutions in 
the field since the invention of the telescope. 
First, in the twentieth century, astronomy 
became multi-wavelength: we can now detect 
radiation across the electromagnetic spectrum
(from radio waves to γ-rays). Second, in
this century, it became multi-messenger: we 
can now detect a broad range of emissions — 
from high-energy cosmic rays and neutrinos 
to gravitational waves. The discovery of 
GW170817 demonstrated the full potential of
these advances for the first time. 

After being alerted to the gravitational-wave 
signal, astronomers used just about every type 
of telescope available to try to view the event. 
As a result, a wide variety of data was obtained,
potentially providing enough information to
pin down a complete picture of what physically
happened when the neutron stars merged. In
particular, NASA's Fermi Gamma-ray Space
Telescope detected a flash of γ-rays that had
formed within two seconds of the merger’. The
properties of the flash were consistent with a
γ-ray burst (a cosmic explosion long thought
to be related to neutron-star mergers), which
immediately increased interest in GW170817.
However, the exact cause of the γ-ray emission
became a matter of debate.

Standard γ-ray bursts can be produced only
by a jet — an outflow of material moving at a 
speed at least 99.9% that of light. But the burst 
from GW170817 was about 10,000 times 
weaker than these bursts and seen only because 
it occurred relatively close to us’. Such a weak 
burst could have come from an off-axis jet (one 
that was aimed away from us), which would 
allow only the tiny fraction of light that it emit- 
ted sideways to be observed. But it could also 
have been produced by a comparatively slow- 
moving cocoon of matter, perhaps travelling at 
‘only’ 95% of the speed of light (Fig. 1). 
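
To put the two speeds in perspective, one can compare their Lorentz factors, Γ = 1/√(1 − β²), where β is the speed as a fraction of the speed of light. The sketch below is only an illustration of that standard formula; the function name is an assumption made here and does not come from the study.

```python
import math

def lorentz_factor(beta: float) -> float:
    """Return the Lorentz factor for a speed given as a fraction of c."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)

jet = lorentz_factor(0.999)    # jet: at least 99.9% of the speed of light
cocoon = lorentz_factor(0.95)  # cocoon: perhaps 95% of the speed of light
print(f"jet: {jet:.1f}, cocoon: {cocoon:.1f}")
```

On these numbers the jet has a Lorentz factor of about 22 and the cocoon about 3, so the 'slow' cocoon is still relativistic but far less extreme than the jet.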

The initial papers”® concluded that both 
scenarios are possible, and that additional data 
should allow us to identify which one is correct.
Mooley et al. now fulfil this promise. They
show that although the outburst of radiation
from GW170817 is dying down, the intensity
of its radio emission is rising — a finding that 
is difficult to explain using the relatively simple 
jet models that they consider. The presence of 
an off-axis jet that breaks free of the surround- 
ing material is not completely excluded, but 
the cocoon model is more consistent with the 
observational data. 

Establishing the origin of the electromagnetic
emission from GW170817 is key to
gaining a detailed understanding of the relationship
between gravitational-wave events,
neutron-star mergers and γ-ray bursts. If a
broader set of jet and cocoon models than
Mooley et al. considered, together with
high-quality simulations of these models,
supports the authors’ conclusions, nature will
once again have shown us that the range of
possible phenomena is wider than our simplest
thinking suggested. If the cocoon model
is correct, this probably implies that many
more gravitational-wave events have associated
γ-ray bursts than was previously thought.

However, Mooley and colleagues’ explana- 
tion for the burst does not affect our basic 
understanding of what happens in a neutron- 
star merger — all of the models considered 
by the authors have great commonality. 
For instance, the merged stars are always 
surrounded by matter that is both inflowing 
(in the equatorial plane of the merged stars) 
and outflowing (in all other directions). And 
a faster and narrower outflow is always driven 
into this matter along the rotational axis of the 
merged stars. 

The differences are in the precise outcome of 
the attendant fluid dynamics. How much mass 
is contained in slow parts of the surrounding 
matter, and how fast and narrow is the jet? 
Does the jet break out of the surrounding 
matter so that it can be seen by us, or does it 
share its energy with this material, producing 
a relatively slow and broad explosion? 

The next few years will see fierce efforts to 
address these questions. Gravitational-wave 
events will be under surveillance by an army 
of telescopes, to find or exclude a fast jet. And 
elaborate computer simulations will be used 
to try to determine what we should expect 
to happen in neutron-star mergers from a
theoretical standpoint. The next handful of
well-observed events will bring us much closer
to the answers.
Solitons divide
and conquer 


An experimental technique allows packets of light called solitons to maintain 
their shape in all three dimensions as they travel through a material. Such wave 
packets could find applications in optical information processing. 


FRANK W. WISE 


Waves spread out as they propagate. A
familiar example is the broadening 
of a beam of light. The challenge 
of overcoming the ubiquitous spreading of 
waves has motivated scientists for decades, and 
packets of light waves that retain their shape, 
known as solitons, have been demonstrated in 
one and two dimensions’. However, it has been 
extremely difficult to create solitons that are 
stable in three dimensions. Writing in Physical 
Review X, Lahav et al.” report an experimen- 
tal approach that can produce such objects. 
The work will allow fundamental properties 
of 3D solitons to be investigated, and could 
lead to 3D solitons that have technological 
applications. 
A narrow beam of light contains wave
components that propagate in different
directions. As the beam travels through a 
material, these wave components get out of 
sync, causing the beam to spread out — a process
known as diffraction. However, if the beam
is powerful enough, the light changes the material’s
refractive index (a quantity that describes
how light propagates in a medium), which in
turn affects the beam. In particular, if the beam
has a bell-shaped intensity profile, as do most
laser beams, the material focuses the beam like
a lens. By tuning the beam intensity, this focusing
can counteract diffraction to produce a
‘self-guided’ beam that does not spread out.

In addition to diffraction, a pulsed beam
exhibits a broadening effect along its direction
of propagation. Each pulse of light contains
wave components that have a range of
frequencies (colours), and, as a pulse moves
through a material, these components separate
— a process called dispersion. There are two
types of dispersion: normal and anomalous. In
normal dispersion, the low frequencies move
faster than the high frequencies (‘red leads
blue’), whereas in anomalous dispersion, the
high frequencies lead the low frequencies
(‘blue leads red’).

Figure 1 | Light pulses versus 3D solitons. a, A pulse of light tends to spread out as it propagates through
a material (coloured arrows). It broadens along the direction in which it is travelling (the x axis) as a result
of dispersion, whereby components of the pulse that have different frequencies separate. Furthermore, it
widens along the perpendicular directions (the y and z axes) because of diffraction. b, Lahav et al.’ report
a technique for producing three-dimensional solitons — packets of light that maintain their geometry as
they move through a material.

However, a high-power beam can cause the 
change in the material’s refractive index to shift 
the lower frequencies (‘red’) to the front of the 
pulse and the higher frequencies (‘blue’) to the 
rear. By tuning the beam intensity, the effect 
of anomalous dispersion can be cancelled out. 
Furthermore, if a bright beam is turned off 
and then back on (a dark pulse), the frequency 
shifts are reversed and normal dispersion can 
be neutralized. 

A 3D soliton, sometimes referred to as 
a light bullet, is the result of cancelling out 
diffraction and dispersion simultaneously 
(Fig. 1). Although such objects exist in theory, 
they are notoriously unstable. The focusing of 
the beam by the material must perfectly bal- 
ance diffraction, and it is extremely difficult 
to counteract diffraction and dispersion at the 
same time, because these actions require dif- 
ferent beam intensities. Scientists have gener- 
ated solitons that are stable in two dimensions 
(one along the direction of propagation and 
one perpendicular to this direction)’, and 3D 
solitons in a highly structured material (glass 
patterned with an array of optical devices 
called waveguides)’. But it has not been possible
to create 3D solitons in an unstructured
material — which is desirable for studying
these objects and for practical applications. 

It has been known for more than 20 years 
that self-guided light beams can be gener- 
ated in photorefractive materials’. These are 
materials that exhibit a temporary change in 
refractive index when exposed to a beam of 
light, as a result of electrons moving through 
them. The focusing of the beam occurs in 
such a way that the need for perfect control 
of the beam intensity is eliminated. Fur- 
thermore, bound electrons produce the fre- 
quency-shifting refractive index required to 
eliminate dispersion. There is only one hitch: 
the material needs to have regions of negative 
and positive electric charge, but such a charge 
distribution takes time to establish — longer 
than the duration of a short light pulse.

Lahav and colleagues’ solution was to shine a 
repetitive string of such pulses into a crystal of 
the photorefractive material strontium barium 
niobate, which responded to the power aver- 
aged over many pulses to create a self-guided 
beam. The response of the bound electrons 
in the crystal then allowed dispersion to be 
cancelled out in each pulse. The result was a 
string of 3D ‘pulse-train’ solitons — so named
because the properties of each soliton depend 
on the solitons that come before it. 

The authors used pulses of 800-nanometre 
wavelength, which meant that the crystal had 
normal dispersion. As a consequence, although 
the beam produced was bright, its temporal 
profile consisted of a dark pulse (see Figure 2 of 
the paper’). By performing similar experiments
at longer wavelengths, for which the crystal has
anomalous dispersion, it should be possible to
generate bright 3D solitons — one of the major
goals in the field of nonlinear optics.

Considering the difficulty in controlling
localized 3D wave packets, Lahav and colleagues’
results constitute a substantial advance.
Interest in localized but non-spreading 3D
wave packets extends well beyond optics, to
areas as disparate as exotic states of matter
known as Bose-Einstein condensates® and
excitations of substances called ferromagnetic
colloids’. Furthermore, it should now be
possible to investigate how 3D solitons interact
when they collide — do they pass right through
each other, interact or merge? This information
might be useful some day for optical
information processing®.


Smoking gun for a rare
mutation mechanism


In 1953, James Watson and Francis Crick proposed that rarely formed isomers
of DNA bases cause spontaneous mutations to occur during the copying of DNA.
Sixty-five years later, it looks as though they were right. SEE ARTICLE P.195


MYRON F. GOODMAN


How do mutations arise when DNA is
copied in cells? On page 195, Kimsey
et al.' combine observations of DNA
structure with measurements of enzyme kinetics
and computational modelling to provide a
definitive explanation of a seminal mechanism.
The elucidation of the structure of DNA
reported in James Watson and Francis Crick’s
classic 1953 paper” is a monumental piece of
work. The key finding was that DNA has a
double-helix structure held together by specific
interactions between pairs of bases, now
known as Watson-Crick pairs: adenine (A)
pairs up with thymine (T), whereas guanine
(G) pairs with cytosine (C). The DNA bases
exist as tautomers (readily interconvertible
pairs of isomers), and the A-T and G-C base
pairs contain each base in its predominant tautomeric
form (Fig. 1a). The process by which
DNA is assembled was not known at the time,
but the discovery of base-pairing suggested
that the sequence of nucleotides on one strand
of a double helix could govern the sequence
that was constructed on the complementary
strand’.

The DNA structure had other far-reaching
implications: it suggested a model for how
mutations might arise spontaneously as DNA
is made. Watson and Crick proposed’ that
mutations could occur because of “a base 
occurring very occasionally in one of the less 
likely tautomeric forms, at the moment when 
the complementary chain is being formed”. 
In other words, G-T and A-C mispairs could 
occur if one of the bases is in a disfavoured tau- 
tomeric form (Fig. 1b). Such mutations would 
be easily accommodated because tautomeric 
mispairs do not distort the helical DNA struc- 
ture. The disfavoured-tautomer model for 
spontaneous mutation formation (mutagen- 
esis) was rapidly adopted by biologists and 
included in textbooks, despite the absence of 
supporting experimental evidence. 
Mispaired structures other than those 
associated with tautomerization were dis- 
covered in the mid-1960s; these included 
the wobble pairs* proposed by Crick, and 
Hoogsteen pairs’. In the mid-1980s, mispairs 
associated with charged forms of DNA bases 
were also identified®*. But it wasn’t until 2011 
that a C-A mismatch associated with a rare 
tautomer was finally observed in an X-ray 
crystal structure’. The mismatch was formed 
between bases of nucleotides bound in the 
active site of DNA polymerase (the enzyme 
that synthesizes DNA from nucleotides) when 
DNA synthesis was performed in the presence
of manganese(II) ions, which are known to
cause mutations. A second X-ray structure’?
reported that year identified an ionized G-T
mismatch, also formed between substrates
bound by DNA polymerase. In both cases, the
mismatched pairs had the same geometry as
Watson-Crick pairs.

Figure 1 | Base-pair structures in DNA. a, The double-helix structure of DNA is held together
by specific interactions (dotted lines) between pairs of bases: adenine (A) pairs with thymine (T),
and guanine (G) pairs with cytosine (C). b, The DNA bases form rare isomeric structures known as
tautomers, which can allow the formation of mispairs; bonds shown in blue are the tautomeric forms of
the bonds shown in red in a. Kimsey et al.’ have detected tautomeric G-T mispairs in DNA duplexes, and
conclude from modelling studies that this explains the frequency with which G-T is misincorporated into
DNA during DNA duplication by polymerase enzymes — as proposed’ by Watson and Crick in 1953.

In Watson and Crick’s model for mutagenesis, 
the rare occurrence of disfavoured tauto- 
meric bases could account for the observed 
frequency with which DNA polymerases 
produce mismatches (about one per thousand 
to one per million base pairs formed"'). But 
such tautomers and the associated base pairs 
were thought to be almost impossible to detect 
in duplexes. Then, in 2015, 62 years after the 
mutagenesis model was proposed, researchers 
from the same group as Kimsey et al. reported 
a tour de force of experimental work: they used 
nuclear magnetic resonance (NMR) spectros- 
copy to identify’? a long-lived wobble G-T 
structure that was in a dynamic equilibrium 
with transient, rarely formed G-T mispairs 
associated with disfavoured tautomers, and 
with ionized G-T⁻ structures, both of which
have Watson-Crick geometry. 

The first step of the DNA-synthesis process 
that forms a G-T pair is the binding of dGTP (a 
G-containing nucleotide) in the polymerase’s 
active site. This is followed by the enzyme’s 
catalytic step, in which the DNA is elongated 
through incorporation of a new G-T base pair. 


Once dGTP is bound in the active site, the 
base pair formed between dGTP and T on the 
complementary strand assumes a distorted 
wobble conformation, but seemingly cannot 
make the conformational transition needed for 
the catalytic step”. 

Kimsey and colleagues’ current study 
goes straight to the heart of the mutagenesis 
model by integrating structural analysis of 
G-T base pairs in duplexes with measurements
of the kinetics of DNA polymerase
reactions and computer modelling to show 
that tautomerism does indeed account for 
the misincorporation of base pairs. To ensure 
efficient catalysis, DNA polymerases require 
optimal geometrical alignment of nucleotide 
substrates with amino-acid residues in their 
active site’*"*. Such alignment can occur when 
G-T adopts one of its Watson-Crick-like 
structures (one of the disfavoured tautomeric 
forms, or the ionized structure’’). Kimsey et al. 
deduced from their studies that, at neutral pH, 
at least 99% of G-T misincorporation is attrib- 
utable to the formation of G-T tautomers — 
rather than of the ionized structure — from 
an initially bound G-T wobble pair. 

By successfully identifying a role for the 
disfavoured tautomeric forms of G-T in 
base-pair misincorporation, Kimsey and 
colleagues have solved half of the mystery 
of spontaneous mutagenesis. A solution for 
the other half now requires the disfavoured 
tautomeric forms of C-A to be characterized
in duplexes and correlated with the rate of
C-A misincorporation. So far, NMR and X-ray
data have identified only charged C-A⁺ wobble
structures in a DNA duplex”®.

A related challenge would be to establish 
the mechanism by which 2-aminopurine, 
a base analogous to both adenine and 
guanine, induces mutagenesis. For example, 
2-aminopurine is a potent mutagen of the 
virus bacteriophage T4, for which it increases 
the frequency of A-T to G-C mutations (and 
of the reverse G-C to A-T mutations) to 
10-50 times the frequency of spontaneous 
mutation levels’*. If 2-aminopurine was found
to undergo a tautomeric shift much more frequently
than A, it would implicate tautomerization
in the mechanism, and thus provide
the icing on the cake for the tautomerization
model of mutagenesis.
CORRECTION

The News & Views ‘Strategy for making safer 
opioids bolstered’ by Susruta Majumdar 
and Lakshmi A. Devi (Nature 553, 286-288; 
2018) incorrectly stated that more than 
100,000 adults suffer from chronic pain in 
the United States. The correct figure is more 
than 100 million adults. 
Biomechanics of predator-prey arms
race in lion, zebra, cheetah and impala 


Alan M. Wilson¹, Tatjana Y. Hubel¹, Simon D. Wilshin¹, John C. Lowe¹, Maja Lorenc¹, Oliver P. Dewhirst¹,
Hattie L. A. Bartlam-Brooks¹, Rebecca Diack¹, Emily Bennitt², Krystyna A. Golabek³, Roger C. Woledge¹†,
J. Weldon McNutt³, Nancy A. Curtin¹ & Timothy G. West¹


The fastest and most manoeuvrable terrestrial animals are found in savannah habitats, where predators chase and capture 
running prey. Hunt outcome and success rate are critical to survival, so both predator and prey should evolve to be faster 
and/or more manoeuvrable. Here we compare locomotor characteristics in two pursuit predator-prey pairs, lion-zebra 
and cheetah-impala, in their natural savannah habitat in Botswana. We show that although cheetahs and impalas were 
universally more athletic than lions and zebras in terms of speed, acceleration and turning, within each predator-prey 
pair, the predators had 20% higher muscle fibre power than prey, 37% greater acceleration and 72% greater deceleration 
capacity than their prey. We simulated hunt dynamics with these data and showed that hunts at lower speeds enable prey 
to use their maximum manoeuvring capacity and favour prey survival, and that the predator needs to be more athletic
than its prey to sustain a viable success rate.


In a chase, the prey animal can select its speed and the timing of acceleration,
deceleration and turns, whereas a predator in pursuit must
predict or respond to the trajectory of its prey to enable interception
and capture’ ?. The prey should make its movements unpredictable to
the predator while generally using tactics that minimize the chance the 
predator has of catching it*. Therefore, although the optimum avoid- 
ance strategy might be, for instance, to perform a maximum-rate turn 
away from the predator, using this strategy consistently would enable 
the predator to anticipate that manoeuvre’. If a dominant evolutionary 
pressure on the locomotor system is predation success or evasion, 
then predator and associated prey should display similar high levels of 
athleticism®° distinguished by the specific adaptations necessary to 
enable capture (predators) or evade capture (prey)'®. We hypothesize 
that predators are consistently more athletic than their prey so that 
they can manoeuvre and change speed to respond to the unpredictable 
tactics of the prey animal. 

We studied two predator-prey pairs found on the southern African 
savannah, where a simple high-speed manoeuvring pursuit in open 
terrain is a commonly used hunting technique: cheetah, Acinonyx 
jubatus, and impala, Aepyceros melampus"', which are similar in 
size (50-70 kg compared to 50-60 kg, respectively; Methods), and
the substantially larger lion, Panthera leo, and zebra, Equus quagga’ 
(120-240 kg compared to 320 kg) (Fig. 1a-d, f). 

We evaluated five metrics. First, locomotor muscle maximum power 
output and contraction speed, which is assumed to be critical for speed, 
acceleration and turning performance!?"4. Second, animal acceleration
and deceleration (in the direction of travel). This combines muscle 
power and volume with factors that include grip, body shape!* and 
the anatomical arrangement of muscles'®. Third, the highest speed 
commonly used by each species and the actual top speed recorded’. 
Fourth, animal turning performance (centripetal or lateral acceleration 
and heading rate), which can be limited by grip”, leg strength’® and 
muscle power’?”®. Finally, stride frequency, because there is only one 
opportunity per stride for the legs to apply impulses to change speed 
and direction in a hunt?!. 


Locomotion data were collected from free-living wild animals under- 
taking high-speed runs in northern Botswana using our own design 
of Global Positioning System and Inertial Measurement Unit collars? 
(GPS-IMU; Methods, Fig. 1 and Extended Data Fig. 1a). We collected 
velocity and acceleration data from 23,871 strides from 520 runs of 
five cheetahs, 22,491 strides from 515 runs of seven impalas, 111,110 
strides from 2,726 runs of nine lions and 64,952 strides from 1,801 
runs of seven zebras (Extended Data Fig. 1f). Muscle biopsies were 
collected from the biceps femoris, a major propulsive muscle in the 
hind leg (Methods). 


Muscle fibre power in predator and prey 

Muscle biopsies were skinned, placed in a trehalose-glycerol mixture,
frozen in liquid nitrogen in the field and transported to the United 
Kingdom. Peak power, velocity and stress at peak power and maximum 
isometric stress were determined at 25°C for single, skinned fibres 
(Fig. 2a-f). Maximum power and associated velocity and stress were 
then calculated (Methods). 

Complete measurements were made on 37 individual skinned fibres 
from six cheetahs, 30 fibres from five impalas, 50 fibres from eight 
lions and 57 fibres from eight zebras. There was a distinct subpopula- 
tion of ‘low-performance’ fibres (twelve fibres from zebra, eight fibres 
from lions, three fibres from cheetahs and three fibres from impalas; 
Fig. 2d-f and Supplementary Data) with a velocity at peak power 
that was below 1.35 lengths s⁻¹ and a lower peak power (Fig. 2c and
Extended Data Fig. 2g), which were either myosin heavy chain (MHC) 
type-I (11 of 19 fibres tested) or type-II (8 fibres) (Methods). 

Linear mixed-effects models were fitted for peak power, velocity and 
stress at peak power and isometric stress with a factor distinguishing 
predator and prey, including the interaction of this factor with a 
categorical variable called ‘fibre performance classification’. Within the
factor distinguishing predator and prey, we nested a random effect by 
subject and fibre. The residuals of this model exhibited heteroscedasticity 
and so the variance of the error term was allowed to vary by subject 
and performance classification. Power in the high-performance fibres
was 20% greater in the predator group than in the prey group (effect
size = 20.0 W kg⁻¹, z = −3.46, P = 0.001). The difference was similar
in both pairings. However, the effect was significant in the lion-zebra
pair (effect size = 20.0 W kg⁻¹, 20%, z = 2.56, P = 0.039) but not
in the cheetah-impala pair (effect size = 18.9 W kg⁻¹, 19%, z = 2.04,
P = 0.15). The peak-specific powers were very similar in the two
predator species, and lower but very similar in the two prey species
(mean power of high-performing fibres ± standard error in W kg⁻¹:
cheetahs, 106.7 ± 4.6; lions, 108.1 ± 4.2; impalas, 88.3 ± 5.2; zebras,
88.4 ± 4.1).

Figure 1 | Overview. a–d, The four species in the study wearing the
collars; the release mechanism is shown in d. e, Flowchart summarizing
study. f, Relative animal size and biopsy location (black cross on animal)

No significant differences between predators and prey were detected 
for the velocity at peak power (effect size = 0.096 lengths s⁻¹, z = −1.77,
P = 0.15) or stress at peak power (effect size = 7.2 kPa, z = −2.05,
P = 0.075) for high-performance fibres. Isometric stress was higher in
predators (effect size = 33.4 kPa, z = −2.87, P = 0.008; Extended Data
Fig. 2). 

The values reported here are comparable to data for skinned fibres 
from wild rabbits at 25°C’, but are high compared to published 
values for skinned fibres from large animals****. Muscle power is 
highly temperature dependent” and a temperature coefficient (Q₁₀;
the ratiometric increase in rate with a temperature increase of 10°C)
of 2.3 is appropriate”°, which predicts in vivo muscle power (all fibres,
Extended Data Fig. 2i) of 232 (prey) and 292 (predators) W kg⁻¹ at a
body temperature of 38°C. 
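
The temperature correction described above can be sketched as a simple Q₁₀ scaling, rate(T₂) = rate(T₁) × Q₁₀^((T₂ − T₁)/10). The function below is an illustrative assumption about how such a correction is usually applied, not the authors' code; only the coefficient of 2.3, the two temperatures and the two in vivo estimates are taken from the text.

```python
def q10_scale(rate: float, t_from: float, t_to: float, q10: float = 2.3) -> float:
    """Scale a rate measured at t_from (deg C) to t_to (deg C) using Q10."""
    return rate * q10 ** ((t_to - t_from) / 10.0)

# Multiplicative factor for going from the 25 deg C assay to 38 deg C.
factor = q10_scale(1.0, 25.0, 38.0)
print(f"scaling factor, 25 to 38 deg C: {factor:.2f}")

# Back-calculate the 25 deg C powers implied by the quoted in vivo estimates.
for label, in_vivo in (("prey", 232.0), ("predators", 292.0)):
    print(label, round(q10_scale(in_vivo, 38.0, 25.0), 1), "W/kg at 25 deg C")
```

With a Q₁₀ of 2.3 the 13°C difference corresponds to a factor of a little under 3, so the quoted in vivo values imply all-fibre powers at 25°C of very roughly 80 and 100 W kg⁻¹ for prey and predators, respectively.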

Slower myosin types and muscle fibres are inherently more 
economical?*.>”’, thus slower fibres confer advantages”®, and the fast 
versus slow distribution of fibres reflects the opposing pressures of 
predation (avoidance) on one side and food and water supply, ranging 
distance and environmental conditions on the other?>”®. This may 
partly explain why prey species have lower-power muscle fibres”°.
Therefore, muscles of desert specialists at risk of dehydration and/or 
starvation”, such as camels, vicunas and Arabian oryx, would be 
predicted to be biased towards economy”. Selection pressure for 
greater performance or economy could change fibre type distributions 
or muscle characteristics within a few generations—much more rapid 
than for changes in myosin contractile speed. 


Speed and acceleration of predators and prey 

Stride timing and therefore frequency was derived from collar accele- 
ration data’. Stride speed and accelerations were averaged over each 
stride; change in speed is calculated as the difference in speed between 
along with collar GPS position data for the four species showing range
overlap. F, female; M, male. 
Figure 2 | Muscle contraction mechanics. a, Peak power compared to
volume. b, Maximum isometric stress compared to cross sectional area. 
c, Peak power versus velocity at peak power, each point is a muscle fibre. 
d-f, Box plots show variations in power (d), isometric stress (e) and 
optimal shortening velocity (f) across the four species, with each fibre 
represented as a dot and the low-performance fibres as a plus symbol. 
Data from each individual are shown in a separate vertical column. Line 
indicates the median, box shows the interquartile range (IQR) and the 
whiskers are 1.5 IQR. Data are from 37 fibres from six cheetahs (C), 30 
fibres from five impalas (I), 50 fibres from eight lions (L) and 57 fibres 
from eight zebras (Z). 


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


ARTICLE 


[Figure 3, panels a-n: stride work (J per kg BM), stride frequency (Hz), power (W per kg BM), speed change (m s⁻¹), tangential acceleration (m s⁻²), centripetal acceleration (m s⁻²) and related turning measures plotted against horizontal speed (m s⁻¹) for cheetah, impala, lion and zebra, with cheetah:impala and lion:zebra ratios. Stride-frequency panels distinguish the middle 60% and top 20% of power strides; limit lines are drawn for μ = 0.6 and μ = 1.3.]


Figure 3 | Locomotor performance based on stride parameters. All
values are averaged per stride or represent the change over a stride and
where appropriate are per kg body mass (BM). a–e, Accelerating strides.
a, Positive net work performed in each stride. b, Stride frequency, mean
of 20% highest power strides and middle 60% of power strides.
c, Average mass-specific power per stride. d, Increase in speed per stride.
e, Tangential (forward) acceleration with the curved lines representing a
stride mean power of 30, 60, 90, 120 and 150 W kg⁻¹ with a limit line for
a coefficient of friction (μ) of 1.3. f–j, as a–e, but for decelerating strides.
k–n, Turning. k, Centripetal (lateral) acceleration. l, The relationship
between speed and turn radius with limit lines for μ = 0.6 and μ = 1.3.


two consecutive strides, work per stride is the change in mass-specific 
net horizontal kinetic energy and power per stride is the work per stride 
divided by stride duration. Change in heading is the angle between two 
consecutive stride velocity vectors”. 
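
The per-stride definitions above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name, the use of two-dimensional horizontal velocity vectors and the example numbers are all invented here, and the authors' actual processing is described in their Methods.

```python
import math

def stride_metrics(v0, v1, duration):
    """Per-stride speed change, work, power and heading change.

    v0, v1: horizontal velocity vectors (vx, vy) in m/s, each averaged over
    one of two consecutive strides; duration: stride duration in seconds.
    """
    s0 = math.hypot(*v0)
    s1 = math.hypot(*v1)
    speed_change = s1 - s0              # m/s
    work = 0.5 * (s1 ** 2 - s0 ** 2)    # change in kinetic energy, J per kg
    power = work / duration             # W per kg body mass
    # Signed angle between the two stride velocity vectors, in degrees.
    cross = v0[0] * v1[1] - v0[1] * v1[0]
    dot = v0[0] * v1[0] + v0[1] * v1[1]
    heading_change = math.degrees(math.atan2(cross, dot))
    return speed_change, work, power, heading_change

# Hypothetical stride pair: 6 m/s heading along x, then 8 m/s heading along y.
print(stride_metrics((6.0, 0.0), (0.0, 8.0), 0.5))
```

Using `atan2` of the cross and dot products gives a signed heading change that stays well behaved for small and large turns alike, which a plain `acos` of the normalized dot product would not.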

Differences in the frequency of maximum effort manoeuvring 
between predators and prey (since predators hunt often and prey are 
rarely hunted) would manifest in different tails for the distributions of 
accelerations for each species. The predator species will have relatively 
heavy tails, that is, higher kurtosis, as more of their observed behaviours 
are associated with rapid accelerations, whereas the more sedentary 
(or at least steadily moving) prey have fewer such observations. Steady- 
state strides were removed by including a threshold on acceleration 
with the threshold determined for each species by the kurtosis of these 
distributions, which resulted in a similar distribution for all species 
(Methods and Extended Data Fig. 3a-c). Qualitatively, the distributions 
for predators and for prey are similar and the 98th percentile approximates
the limit of the distribution in a reasonably consistent manner
across runs of all lengths and tortuosity (Extended Data Fig. 4). 




m, Change in heading compared to horizontal speed. n, Tangential 
compared to centripetal acceleration with μ limits as for l. n, F, pure
forward acceleration; B, deceleration; C, centripetal acceleration; p, polar 
coordinate. In each panel one line per species is shown, which (except in b, 
g) represents the 98th percentile for data in speed bins (each bin contains 
400 data points therefore bin width varies). At the bottom of each panel, 
the ratio of that parameter for cheetah-impala (red circle) and lion-zebra
(blue circle) is given for each speed bin, same x axis. Dataset comprised 
7,509 strides for 520 runs from five cheetahs; 8,884 strides for 515 runs 
from seven impalas; 15,947 strides for 2,726 runs from nine lions and 
14,089 strides for 1,801 runs from seven zebras. 


Stride parameters were grouped into non-uniform speed bins with 
400 data points in each and the 98th percentile of the distribution was 
determined for each bin (except for stride frequency, for which data 
were further subgrouped on the basis of acceleration performance and 
a linear regression was performed on each subgroup (Methods)). The 
uppermost bin with fewer than 400 data points was ignored. The 98th 
percentile was chosen to account for the different numbers of strides in 
different species and to exclude occasional extreme values (Extended
Data Figs 5, 6). The cheetah-impala pair was more athletic than the 
zebra-lion pair for every metric (Extended Data Fig. 7). 
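
The binning step described above can be sketched as follows. This is an illustration under stated assumptions, not the published code: the helper names are hypothetical, each bin is summarized by its mean speed, and the percentile uses linear interpolation between order statistics, neither of which is specified in the text.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation (assumed method)."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def binned_percentile(speeds, values, bin_size=400, q=98.0):
    """Group strides into equal-count speed bins and take a percentile per bin.

    Strides are sorted by speed, so bin width varies with data density.
    The uppermost bin with fewer than bin_size points is ignored.
    Returns a list of (mean bin speed, percentile of the stride parameter).
    """
    pairs = sorted(zip(speeds, values))
    result = []
    for start in range(0, len(pairs) - bin_size + 1, bin_size):
        chunk = pairs[start:start + bin_size]
        mean_speed = sum(s for s, _ in chunk) / bin_size
        result.append((mean_speed, percentile([v for _, v in chunk], q)))
    return result
```

With 1,000 strides and the default bin size, two bins are returned and the 200 fastest strides are dropped, mirroring the rule that the incomplete uppermost bin is ignored.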

Predator and prey were compared using a linear model (Methods) 
and test statistics computed under the null hypothesis that predator 
and prey are drawn from the same distribution, except for stride 
frequency, for which, because of species pairing differences, predator 
and prey pairs were compared individually. The ratio of the maximum 
observed performance for cheetah-impala then lion-zebra, along 
with the results of the test comparing predator and prey across species, 
are as follows: predators were 50% and 24% superior at acceleration 
Figure 4 | Output of the model of predator-prey interaction. a, Predator
(blue) and prey (red) with initial (upward) velocity and separation. b, After 
one stride the prey can accelerate to anywhere in the red ellipse. Predator 
velocity remains unchanged as there is no prey acceleration in the previous 
stride. The initial positions are shown. c, The red ellipse perimeter is 

the area that prey can reach after two strides at the chosen maximum 
acceleration. The blue ellipse represents the locations the predator can 
occupy (responding to the prey acceleration observed in first stride). The 
area of the prey ellipse that is covered by the predator ellipse is defined as 
the probability of capture. d, As in c for different initial conditions. Rows 
are different initial impala speeds, values in red to the left. Columns are 
different initial separations with values in red below each column. Scale 
for all instances in the bottom left plot are in metres. The black numbers 


(z = 3.15, P = 0.0016); 73% and 70% better at deceleration (z = −6.61,
P < 0.0001); 100% and 89% more powerful during maximal acceleration
(z = 3.87, P = 0.0001); and 100% and 122% more powerful during
maximal deceleration (z = −8.07, P < 0.0001). Stride frequency
was higher for cheetahs than for impalas (z = 3.69, P < 0.001) and
lower for lions than for zebras (z = −2.31, P = 0.041). Across all species,
stride frequency at 8 m s⁻¹ was 6% higher during acceleration
(P = 0.0018) and 5% higher during deceleration (P < 0.001) than
during steady speed locomotion, as determined by post hoc tests on
the linear model.

The 98th percentile of speed was 19.9 m s⁻¹ for cheetahs, 13.8 m s⁻¹
for impalas, 13.9 m s⁻¹ for lions and 10.6 m s⁻¹ for zebras. This was 84,
78, 67 and 77%, respectively, of the maximum achieved by the third
fastest individual of each species, which was 23.8 m s⁻¹ for cheetahs,
17.7 m s⁻¹ for impalas, 20.6 m s⁻¹ for lions and 13.8 m s⁻¹ for zebras.
Therefore, predators were faster than their prey and all species 
rarely approached their maximum recorded speed (Extended 
Data Fig. 5). 


Turning performance of predators and prey 

When turning, predators were only slightly superior to prey (z = 2.93,
P = 0.0034): cheetah-impala 15%, lion-zebra 10% (Fig. 3k-n). Turning
does not require a change in body kinetic energy, but a centripetal
acceleration of 13 m s⁻² results in a 66% increase in effective weight
and the limbs must shorten and extend in the presence of these higher
axial forces. This length change can be delivered by passive elastic
structures within the limb, but any associated muscles must deliver
higher forces at that contraction velocity (equating to a higher power
requirement). Reduced centripetal acceleration at high speed would
indicate a muscle power limit rather than a grip limit for that activity;
however, we found no such evidence for a power limit at these
submaximal speeds.
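
The quoted 66% figure is consistent with treating effective weight as the vector sum of gravitational and centripetal acceleration. The short check below assumes that interpretation and g = 9.81 m s⁻²; neither is stated explicitly in the text, and the function name is hypothetical.

```python
import math

G = 9.81  # assumed gravitational acceleration, m/s^2


def effective_weight_ratio(centripetal_acc, g=G):
    """Effective weight relative to body weight during a turn.

    Assumes the limb must support the vector sum of gravity (vertical)
    and centripetal acceleration (horizontal).
    """
    return math.hypot(g, centripetal_acc) / g


# Percentage increase in effective weight at 13 m/s^2 centripetal acceleration.
increase_percent = (effective_weight_ratio(13.0) - 1.0) * 100.0
```

Under these assumptions the increase rounds to 66%, matching the value given above.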
are the initial (optimized for maximum success) cheetah speeds in m s⁻¹.
e, Optimum cheetah speed to maximize overlap (colour key on the right) 
as a function of impala speed (x axis) and starting separation (y axis). 
Histogram at the top shows actual impala speeds at first turn of 10 degrees 
or more (x axis as the main plot) and the vertical histogram shows the 
distribution of actual cheetah speed at the first turn (scale as heat bar). 

f, Proportional overlap (capture probability) as a function of initial speed 
of impalas and starting separation. g, h, Number of hunts required to have 
a 99% chance of prey capture for different performance levels of prey and 
predator. Labels and line colours indicate species pairings for each line. 
Actual accelerations multiplied by the x axis value; the simulation was run 
at 8.75 m s⁻¹ initial prey speed and initial predator-prey spacing of 2 m.


Figure 3n summarizes the capacity for maximum acceleration in any 
direction, relative to the track of the animal. It shows that these preda- 
tors outperform their prey most markedly during deceleration (bottom) 
and less so during forward acceleration (top) and turning (sides). No 
species showed highest levels of tangential and centripetal acceleration 
in the same stride; the lines are elliptical, which supports a grip-type 
limit (as horizontal accelerations should vector sum to a limit value). 
Forward acceleration performance was maintained by all four species 
at the fastest speeds commonly used (Fig. 3e, j). Power requirements 
for forward acceleration increase with speed (Fig. 3c, h and Extended 
Data Fig. 5), because power is the product of speed and acceleration, 
and if maximal acceleration was lower at the highest speeds, this would 
indicate a potential power constraint. A reduction in manoeuvrability
would result in an animal's trajectory being more predictable, which 
would be disadvantageous for both predator and prey. 
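
Because mass-specific power is the product of speed and acceleration, the constant-power curves and the grip limit line in Fig. 3e can be sketched as below. This is illustrative only: the function names are hypothetical, g = 9.81 m s⁻² is assumed, and the grip limit is taken as μg acting on forward acceleration alone.

```python
G = 9.81  # assumed gravitational acceleration, m/s^2


def acceleration_at_power(power_w_per_kg, speed):
    """Forward acceleration (m/s^2) sustained by a given mass-specific power."""
    return power_w_per_kg / speed


def grip_limit(mu, g=G):
    """Largest horizontal acceleration (m/s^2) for friction coefficient mu."""
    return mu * g


def max_forward_acceleration(power_w_per_kg, speed, mu):
    """The lower of the power-limited and grip-limited accelerations."""
    return min(acceleration_at_power(power_w_per_kg, speed), grip_limit(mu))
```

At a fixed power the available acceleration falls as 1/speed, so a power constraint would show up as reduced maximal acceleration at the highest speeds, whereas a grip constraint is independent of speed.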


Species differences in experimental data 

Much of the difference observed between predator and prey could 
be attributed to predators having proportionally more muscle and/or 
higher muscle power (Fig. 2d), but that does not provide an expla- 
nation for the large differences that were observed between lion and 
cheetah and between zebra and impala (Extended Data Fig. 7). Hind 
limb muscle fraction of total body mass is fairly consistent across 
species: 17.5-19.8% (Extended Data Table 1), so muscle peak power 
should define the acceleration capacity of the whole animal at moderate 
to high speeds. Athletic wild animals are, however, likely to be
proportionally more muscular than the mostly sedentary domesticated 
animals contributing to Extended Data Table 1 and spinal, trunk and 
forelimb muscle will also contribute to acceleration. The predicted 
in vivo muscle powers of 232-292 W kg⁻¹ are concomitant with
the upper, but not lower, limit of observed whole animal powers of
30-120 W kg⁻¹ (Fig. 3e).

Figure 5 | Orientation of direction of acceleration for each stride 
grouped by species. Circular histogram of frequency/direction of the 
acceleration vector for each stride (steady state strides were removed) 
binned in twelve 30-degree sectors. Upwards is forward acceleration, down 
is deceleration, left and right are turns in that direction. Height of each bin 
from the centre indicates the number of strides. The dataset comprised 
7,509 strides for 520 runs from five cheetahs; 8,884 strides for 515 runs 
from seven impalas; 15,947 strides for 2,726 runs from nine lions and 
14,089 strides for 1,801 runs from seven zebras. 


Carnivores hunt with empty stomachs, whereas prey carry the mass 
of rumen (impala) or hind gut (zebra) contents, which will impinge 
on any performance that is dependent on muscle power or strength 
(as would pregnancy). The differences within the predator and within 
the prey species may reflect that muscles are arranged for different 
roles, for example, for economical walking versus for acceleration and 
hunting or fighting, but without contextual anatomical data,
this is only speculation and the differences are too large to simply be
attributed to scaling due to animal size (Extended Data Table 2a). Foot
design and grip may also have a role. Behavioural factors cannot be
ruled out, but our data indicate that the highest values were captured 
(Extended Data Figs 5, 6). 


Capture-evasion model description and predictions 

A pursuit predator uses a combination of stealth and speed to get 
close to its prey and then the prey evades capture by manoeuvring,
while the predator attempts to intercept it. The interaction has been
approached analytically or numerically for continuous processes
(for example, air combat manoeuvring), but modelling the probability 
that the predator and prey arrive at the same location becomes increas- 
ingly complex to solve when treated as a discrete process. 

In our model, predator and prey were able to accelerate in any direc- 
tion up to their experimentally derived maximum during each stride, so 
they could go anywhere on the boundary of an approximately elliptical 
area that grew with the subsequent stride (Fig. 4a-c). The predator
responds to the acceleration of its prey in the preceding stride and we 
modelled the range of initial conditions for which the predator could 
catch its prey within two strides. The acceleration limits for each species 
and direction (impulses per stride) were the observed 98% values of 
centripetal, positive and negative tangential acceleration divided by the
stride frequency at that speed (Fig. 3b, e, g, j). The elliptical area pre- 
vented simultaneous maximal centripetal and tangential accelerations 
(Fig. 3n). At higher speeds, acceleration and therefore manoeuvring 
were curtailed as the applied impulses could not cause the animal to 
exceed the 98% maximum speed observed for each species. 

For the prey, the accelerations at the start of the first and second 
stride were the possible accelerations up to maximum (Extended Data 
Table 2b) in any direction. The predator had zero acceleration in the 
first stride so its initial velocity determined its subsequent position 
and it could accelerate in any direction in the second stride (reacting 
to the previous prey acceleration). The area reached by the predator 
was increased by a semi-circular region with a size of half the predator 
body length to account for the physical size of the predator. We define 
capture probability as the fraction of the elliptical area of the prey that 
is covered by the elliptical area of the predator after two strides. 
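
The overlap definition of capture probability can be illustrated with a simple grid estimate. This is a sketch under strong simplifying assumptions and is not the model code referred to in the Supplementary Information: both reachable regions are taken as axis-aligned ellipses with given centres and semi-axes, the body-length margin is omitted, and all names are hypothetical.

```python
def in_ellipse(x, y, cx, cy, semi_lateral, semi_forward):
    """True if point (x, y) lies inside an axis-aligned ellipse."""
    return ((x - cx) / semi_lateral) ** 2 + ((y - cy) / semi_forward) ** 2 <= 1.0


def capture_probability(prey, predator, n=200):
    """Fraction of the prey ellipse covered by the predator ellipse.

    prey, predator: (cx, cy, semi_lateral, semi_forward) in metres, describing
    the region each animal can reach after two strides (assumed inputs).
    The prey bounding box is sampled on an n x n grid of cell centres.
    """
    px, py, pa, pb = prey
    inside_prey = 0
    covered = 0
    for i in range(n):
        for j in range(n):
            x = px - pa + (i + 0.5) * 2.0 * pa / n
            y = py - pb + (j + 0.5) * 2.0 * pb / n
            if in_ellipse(x, y, *prey):
                inside_prey += 1
                if in_ellipse(x, y, *predator):
                    covered += 1
    return covered / inside_prey
```

A predator region that coincides with the prey region gives a value of 1, disjoint regions give 0, and a concentric predator region with half the semi-axes gives roughly 0.25.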

We plotted the feasible range of initial prey speeds and predator- 
prey spacings for capture after two strides (Fig. 4d and Extended Data 
Fig. 8) and then optimized the initial predator speed for each condition 
to maximize the overlap in position between the predator and the prey 
after two strides. The predator-prey spacing at the beginning of the 
simulation represents less than a stride length in all cases (code used 
for the model written in Python can be found in the Supplementary 
Information). 

The model shows that the prey should avoid the predator by turning 
(lateral acceleration), rather than attempting to increase separation by 
travelling as fast as possible (Fig. 4d). If the prey is moving fast and 
the predator is close (Fig. 4d, bottom left), its best option requires 
rapid deceleration and turning, whereas turning alone becomes more 
beneficial if the predator is further away (and therefore closing at higher 
relative speed, Fig. 4d, bottom right). High prey speeds result in high 
capture probabilities (Fig. 4f), because the prey cannot accelerate 
forwards with or without turning, making its tactics highly predictable 
(captured by optimization of predator speed for overlap), whereas a 
slow moving prey (Fig. 4e, f, left) has a wider variety of escape options 
and is therefore less predictable. Predator and prey indeed use moderate 
speeds (Fig. 4e and Extended Data Fig. 8). 

The predator has the highest chance of success if it is travelling only 
slightly faster than the prey, which enables it to reach many of the loca- 
tions the prey can move to across a broad range of starting speeds (the 
objective function for the optimization, relative capture area, is very 
flat in this region), and its advantage increases with higher prey speeds. 
This is reflected by the observed predator speeds (Fig. 4e and Extended 
Data Fig. 8). 

Figure 5 shows that all species often execute a constant speed turn, 
but that it is rare for either of the herbivore species to accelerate or 
decelerate, whereas predators (especially lions) often undertake decel- 
eration strides, either in isolation or in combination with a turn. The 
preferred accelerations fit with the prey using the optimum escape strategies
predicted by the non-overlapping areas in Fig. 4d and tactics in
which their performance is similar to that of predators (turning)
rather than those in which they are outperformed (tangential acceleration
and deceleration). With the same lateral acceleration, a prey
that is moving more slowly than a converging faster-moving predator 
will have an advantageously tighter turn, because the radius is equal to 
v²/lateral acceleration. Commonly observed predator decelerations
are concomitant with a faster-moving closing predator. More than
one repetition of the modelled two-stride scenario can occur within
a single pursuit and the overlap-derived success rates are similar to
those observed for animals when hunting in the wild.
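
The turn-radius relationship used above (radius equals speed squared divided by lateral acceleration) is easy to check numerically. The speeds and acceleration in the example are illustrative values chosen here, not measurements from the study.

```python
def turn_radius(speed, lateral_acceleration):
    """Turn radius (m) for a given speed (m/s) and lateral acceleration (m/s^2)."""
    return speed ** 2 / lateral_acceleration


# Hypothetical example: same lateral acceleration, different speeds.
prey_radius = turn_radius(10.0, 10.0)      # slower prey
predator_radius = turn_radius(14.0, 10.0)  # faster, converging predator
```

Because radius grows with the square of speed, the slower animal turns on a markedly tighter arc for the same lateral acceleration, which is why a closing predator benefits from decelerating.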


Effect of athleticism on hunting success rate 

We adjusted the acceleration capacity of the predator or prey and reran 
the simulation to obtain capture probabilities for animals of greater 
or lesser athleticism. Unsurprisingly, increased predator performance 
is beneficial, reducing the number of hunts needed to capture prey 
(Fig. 4g, h). Owing to the power relationship underlying Fig. 4g, h,
curves steepen when the predator is below 0.8 of its actual performance 
(Fig. 4h), which would tend towards an unsustainably low success rate 
(ignoring other determinants of hunt outcome). Such a reduction could 
be the result of an injury or ageing, with greatest consequences for 
solitary animals. The data also provide insight into preferred prey and 
hunting style: the predicted low success rate for lions hunting impala 
(Fig. 4g, h) is supported by the observation that lions capture impala 
opportunistically rather than in an open pursuit. African wild dogs 
hunt impala, but are less athletic than cheetahs. Applying the model
to a single African wild dog hunting an impala predicts a success
rate of 8.2%, which is lower than the actual success rate of 15.5%.
This would concur with African wild dogs capturing impalas during
opportunistic rather than one-on-one pursuit hunts.
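
The number of hunts needed for a 99% chance of at least one capture (Fig. 4g, h) follows from the per-hunt success rate if hunts are treated as independent with equal success probability. That independence assumption, and the function name, are ours for illustration; the text does not spell out the calculation.

```python
import math


def hunts_required(success_rate, target=0.99):
    """Smallest number of hunts giving at least `target` chance of one capture.

    Assumes independent hunts, each succeeding with probability success_rate,
    so that the chance of no capture after n hunts is (1 - success_rate) ** n.
    """
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - success_rate))


predicted_wild_dog = hunts_required(0.082)  # model-predicted success rate
observed_wild_dog = hunts_required(0.155)   # observed success rate
```

The required number of hunts rises steeply as the success rate falls, which is the power relationship that makes small losses of predator performance costly.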


Conclusions 

The study shows that overall, the athletic capabilities of the two pursuit 
predators closely match their respective common prey, leading to a sustainable
success rate, survival of both and reflecting an evolutionary
arms race. The predators have higher muscle power, are faster and
have a greater capacity to accelerate and decelerate than their prey. The 
prey can match their predator’s locomotor capabilities most closely 
through turning manoeuvrability, affording them a critical escape space. 
In evolutionary terms, there may be scope for further development of 
performance, for instance through increasing muscle power, but this 
specialization may be at the cost of locomotor economy, musculoskeletal
robustness, or other ecologically relevant factors, such as prey capture 
ability, fighting or the capacity to adapt to a changing world. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 23 June 2017; accepted 2 January 2018. 
Published online 24 January 2018. 


1. Howland, H. C. Optimal strategies for predator avoidance: the relative 
importance of speed and manoeuvrability. J. Theor. Biol. 47, 333-350 (1974). 

2. Moore, T. Y. & Biewener, A. A. Outrun or outmaneuver: Predator-prey
interactions as a model system for integrating biomechanical studies in a 
broader ecological and evolutionary context. Integr. Comp. Biol. 55, 1188-1197 
(2015). 

3. Combes, S. A., Salcedo, M. K., Pandit, M. M. & Iwasaki, J. M. Capture success
and efficiency of dragonflies pursuing different types of prey. Integr. Comp. Biol. 
53, 787-798 (2013). 

4. Domenici, P., Booth, D., Blagburn, J. M. & Bacon, J. P. Cockroaches keep 
predators guessing by using preferred escape trajectories. Curr. Biol. 18, 
1792-1796 (2008). 

5. Domenici, P., Blagburn, J. M. & Bacon, J. P. Animal escapology II: escape
trajectory case studies. J. Exp. Biol. 214, 2474-2494 (2011). 

6. Van Valen, L. A new evolutionary law. Evol. Theory 1, 1-30 (1973). 

7. Benton, M. J. The Red Queen and the Court Jester: species diversity and the 
role of biotic and abiotic factors through time. Science 323, 728-732 (2009). 

8. Bro-Jørgensen, J. Evolution of sprint speed in African savannah herbivores in
relation to predation. Evolution 67, 3371-3376 (2013). 

9. Wilson, A. M. et al. Locomotion dynamics of hunting in wild cheetahs. Nature 
498, 185-189 (2013). 

10. Irschick, D. J. & Higham, T. E. Animal Athletes: an Ecological and Evolutionary 
Approach (Oxford Univ. Press, 2015). 

11. Broekhuis, F., Cozzi, G., Valeix, M., McNutt, J. W. & Macdonald, D. W. Risk 
avoidance in sympatric large carnivores: reactive or predictive? J. Anim. Ecol. 
82, 1098-1105 (2013). 

12. Schaller, G. B. The Serengeti Lion: A Study of Predator-Prey Relations (Univ.
Chicago Press, 1972). 

13. Curtin, N. A., Woledge, R. C. & Aerts, P. Muscle directly meets the vast power 
demands in agile lizards. Proc. R. Soc. Lond. B 272, 581-584 (2005). 

14. Kohn, T. A. & Noakes, T. D. Lion (Panthera leo) and caracal (Caracal caracal) type 
IIx single muscle fibre force and power exceed that of trained humans. J. Exp. 
Biol. 216, 960-969 (2013). 

15. Williams, S. B., Tan, H., Usherwood, J. R. & Wilson, A. M. Pitch then power: 
limitations to acceleration in quadrupeds. Biol. Lett. 5, 610-613 (2009). 

16. Carrier, D. R., Gregersen, C. S. & Silverton, N. A. Dynamic gearing in running 
dogs. J. Exp. Biol. 201, 3185-3195 (1998). 

17. Tan, H. & Wilson, A. M. Grip and limb force limits to turning performance in 
competition horses. Proc. R. Soc. Lond. B 278, 2105-2111 (2011). 



18. Usherwood, J. R. & Wilson, A. M. Biomechanics: no force limit on greyhound 
sprint speed. Nature 438, 753-754 (2005). 

19. Daley, M. A. in Understanding Mammalian Locomotion: Concepts and 
Applications Ch. 11 (ed. Bertram, J. E. A.) 277-306 (Wiley & Sons, 2016). 

20. Wilson, J. W. et al. Cheetahs, Acinonyx jubatus, balance turn capacity with pace 
when chasing prey. Biol. Lett. 9, 20130620 (2013). 

21. Jindrich, D. L., Smith, N. C., Jespers, K. & Wilson, A. M. Mechanics of cutting
maneuvers by ostriches (Struthio camelus). J. Exp. Biol. 210, 1378-1390 
(2007). 

22. Curtin, N. A., Diack, R. A., West, T. G., Wilson, A. M. & Woledge, R. C. Skinned
fibres produce the same power and force as intact fibre bundles from muscle 
of wild rabbits. J. Exp. Biol. 218, 2856-2863 (2015). 

23. Rome, L. C., Sosnicki, A. A. & Goble, D. O. Maximum velocity of shortening of 
three fibre types from horse soleus muscle: implications for scaling with body 
size. J. Physiol. (Lond.) 431, 173-185 (1990). 

24. Seow, C. Y. & Ford, L. E. Shortening velocity and power output of skinned
muscle fibers from mammals having a 25,000-fold range of body mass.
J. Gen. Physiol. 97, 541-560 (1991).

25. Hill, A. V. The dimensions of animals and their muscular dynamics. Science 
Progress 38, 209-230 (1950). 

26. West, T. G. et al. Power output of skinned skeletal muscle fibres from the
cheetah (Acinonyx jubatus). J. Exp. Biol. 216, 2974-2982 (2013). 

27. Crow, M. T. & Kushmerick, M. J. Chemical energetics of slow- and fast-twitch 
muscles of the mouse. J. Gen. Physiol. 79, 147-166 (1982). 

28. Bartlam-Brooks, H. L., Bonyongo, M. C. & Harris, S. How landscape scale 
changes affect ecological processes in conservation areas: external factors 
influence land use by zebra (Equus burchelli) in the Okavango Delta. Ecol. Evol. 
3, 2795-2805 (2013). 

29. Schmidt-Nielsen, K. Desert Animals. Physiological Problems of Heat and Water 
(Clarendon, 1965). 

30. Wilson, A. M., McGuigan, M. P., Su, A. & van Den Bogert, A. J. Horses damp the
spring in their step. Nature 414, 895-899 (2001). 

31. McGuigan, M. P. & Wilson, A. M. The effect of gait and digital flexor muscle
activation on limb compliance in the forelimb of the horse Equus caballus.
J. Exp. Biol. 206, 1325-1336 (2003).

32. Pasi, B. M. & Carrier, D. R. Functional trade-offs in the limb muscles of dogs 
selected for running vs. fighting. J. Evol. Biol. 16, 324-332 (2003). 

33. Carrier, D. R., Anders, C. & Schilling, N. The musculoskeletal system of humans 
is not tuned to maximize the economy of locomotion. Proc. Natl Acad. Sci. USA 
108, 18631-18636 (2011). 

34. Wynn, M. L., Clemente, C., Nasir, A. F. A. A. & Wilson, R. S. Running faster causes 
disaster: trade-offs between speed, manoeuvrability and motor control when 
running around corners in northern quolls (Dasyurus hallucatus). J. Exp. Biol. 
218, 433-439 (2015). 

35. Merz, A. The homicidal chauffeur. AIAA J. 12, 259-260 (1974).
https://doi.org/10.2514/3.49215

36. Getz, W. & Pachter, M. Two-target pursuit-evasion differential games in the 
plane. J. Optim. Theory Appl. 34, 383-403 (1981). 

37. Hubel, T. Y. et al. Energy cost and return for hunting in African wild dogs and
cheetahs. Nat. Commun. 7, 11034 (2016). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank S. Amos for fabricating collars, N. Jordan and
G. Gilfillan, M. Claase and N. Terry and BPCT research assistants for working
with us in the study area and M. Flyman (DWNP) for his support and
enthusiasm; J. Usherwood, R. Bomphrey and A. R. Wilson for comments on
the manuscript; EPSRC (EP/H013016/1), BBSRC (BB/J018007/1) and

ERC (323041) for funding. The Botswana Predator Conservation Trust was 
supported by private donors, Tusk Trust and the Cincinnati Zoo. Work was 
approved by RVC Ethics & Welfare Committee (RVC 2013 1233) and Botswana 
Department of Wildlife and National Parks Research Permits were held by 
J.W.M. and A.M.W. (EWT 8/36/4 plus additions) and a Botswana Veterinary 
Registration held by A.M.W. Tissue shipping was covered by CITES, Botswana 
export, Botswana National Veterinary Laboratory approval, South African transit 
and UK DEFRA import permits. 


Author Contributions A.M.W., T.Y.H., N.A.C., R.C.W. and T.G.W. conceived,
designed and led the study. K.A.G., J.W.M., H.L.A.B.-B. and E.B. organized field 
work, monitored animals and downloaded data. A.M.W. performed veterinary 
procedures, J.C.L. and A.M.W. designed and built collars. R.D., M.L., N.A.C. and 
T.W. carried out muscle experiments and interpreted the muscle data. T.Y.H., 
O.P.D., T.G.W. and A.M.W. analysed data. S.W. created the model and carried out 
statistical analysis. A.M.W. wrote the paper with input from all authors. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Publisher's note: Springer Nature remains neutral with regard to jurisdictional 
claims in published maps and institutional affiliations. Correspondence and 
requests for materials should be addressed to A.M.W. (awilson@rvc.ac.uk). 


Reviewer Information Nature thanks A. Biewener and the other anonymous 
reviewer(s) for their contribution to the peer review of this work. 


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


METHODS 


Data reporting. No statistical methods were used to predetermine sample size. 
The experiments were not randomized and the investigators were not blinded to 
allocation during experiments and outcome assessment. 

Animals. All collared animals were located in northern Botswana with largely over- 
lapping ranges (Fig. 1f). Animals were immobilized by free darting from a vehicle 
or helicopter mostly by A.M.W., using 80-100 mg ketamine and 2 mg medeto- 
midine for cheetahs; 60 mg ketamine, 25 mg tiletamine hydrochloride, 25 mg 
zolazepam hydrochloride (as 50 mg zoletil, Virbac), 2 mg butorphanol tartrate 
and 6 mg medetomidine for lions; 1.5 mg thiafentanil oxalate, 2 mg butorphanol 
tartrate and 1,700 IU hyalase for impalas; and 7 mg etorphine hydrochloride, 50 mg 
azaperone and 1,700 IU hyalase for zebras. The reversal of immobilization of
herbivores was done using diprenorphine or naltrexone at the end of the procedure 
and carnivores with atipamezole at 60 min after darting. While sedated, front and 
hind leg and body lengths were recorded. Collar data were downloaded by radio 
link every few weeks to a ground vehicle and collars were monitored. All animals 
were adult, nine lions (two male, seven female), five cheetahs (two male, three 
female), seven zebras (seven female), seven impalas (six male, one female). The 
lions and cheetahs were part of other ongoing projects in collaboration with the 
Botswana Predator Conservation Trust (http://www.bpctrust.org). Programmable 
drop-offs (two models, 108 g, Sirtrack Ltd or 50 g, Biotrack) were attached to 
the zebra and impala collars, respectively. Two drop-off units failed and collars 
were retrieved by re-darting. 

Data were collected between April 2012 and November 2016 (cheetahs
(June 2012-April 2013), lions (April 2012-June 2013), zebras (November
2014-September 2015), impalas (July-November 2016)). A subset of the data from
cheetahs (367 out of 520 runs) was analysed in ref. 9.
Muscle fibre measurements. Biopsies were taken from the biceps femoris muscle 
by A.M.W., using a Bergstrom needle or conchotome forceps after collar placement.
Animals were clipped, sterility was ensured and the biopsy site was treated with 
local antibiotics (200 mg cloxacillin, 75 mg ampicillin, Curaclox LC) and the animal 
was given analgesia (finadyne or metacam). Five male impala that had been killed 
for meat on a game ranch were dissected and provided additional muscle samples. 
Muscle samples were skinned by 30 min of immersion in ice-cold relaxing solution 
containing 2% Triton X-100 and exposed to a pH-6 relaxing solution to inactivate 
any foot-and-mouth disease virus. Triton X-100 was washed out with fresh relaxing 
solution and samples were immersed in 500 mM trehalose containing 0.5% 
glycerol, frozen in liquid nitrogen and stored in an IATA-approved dry-shipper
(Biotrek 3 Statebourne Cryogenics) for transport to the United Kingdom. In the
United Kingdom, biopsies were stored at −80°C. Periodically, individual biopsies
were thawed and had cryopreserving trehalose replaced with a relaxing solution.
Our previous work showed that biopsies stored for 20 months using this protocol
showed no discernible loss of mechanical power. Thawed biopsies were stored
at −20°C in a relaxing solution made up in glycerol and used for fibre preparation
and testing within four weeks. 

Fibre fragments were first suspended while in relaxing solution between the 
motor and force transducer of a 600A permeabilized-fibre apparatus (Aurora 
Scientific). T-shaped aluminium clips were attached to fibre ends and used to 
suspend fibres from steel wire hooks that were glued with shellac to the motor and 
transducer. Fibres were visualized using a 900B digital camera (Aurora Scientific). 
The camera image was used to set the sarcomere length (SL) of a fibre fragment to 
between 2.5 and 2.6 μm. Fibre length (L₀), depth and width were then measured
in mm. The fibre cross-sectional area was calculated for each fibre, assuming an 
elliptical shape. 
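
For reference, the elliptical cross-section assumption corresponds to the standard ellipse area formula with the measured width and depth as the two diameters. The function name and the example dimensions below are hypothetical.

```python
import math


def fibre_cross_sectional_area(width_mm, depth_mm):
    """Cross-sectional area (mm^2) of a fibre with an elliptical section.

    width_mm and depth_mm are the full diameters along the two axes.
    """
    return math.pi * (width_mm / 2.0) * (depth_mm / 2.0)


# Hypothetical fibre fragment, 0.08 mm wide and 0.06 mm deep.
example_area = fibre_cross_sectional_area(0.08, 0.06)
```

Dividing measured force by this area gives stress, which is how a threshold expressed in kPa can be applied to individual fibres.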

Single skinned fibres were activated by temperature jump, from 1°C to 25°C
(Extended Data Fig. 2a), using approaches similar to those previously described.
The composition and ionic strength (200 mM) of the various solutions were as previously
described. To activate a fibre, it was immersed consecutively in solutions
for low-temperature pre-activation (for 45 s), low-temperature activation (for 4 s),
high-temperature activation (6 s) and high-temperature relaxation. The example
in Extended Data Fig. 2a shows the time courses of solution changes and force
responses for an impala fibre, starting from the final 3 s of cold-temperature pre-activation.
The force baseline at 0.7 L₀ was recorded in high-temperature relaxing
solution before re-setting L₀ to the original starting length and checking SL.

The standard procedure to measure fibre power and determine peak power was 
modified to perform four different force-control events during each 6-s activation 
to deliver more data per fibre. In brief, after temperature jump to 25°C, force
developed to a plateau at constant length (isometric force), and the fibre was then 
clamped for 20 ms to a predetermined fraction of peak isometric force—the actual 
force achieved in the first force clamp was calculated by the 600A based on the 
difference between baseline force (measured and stored within a 600A protocol 
just before 1°C activation) and isometric force (measured and stored within a 
600A protocol just before onset of a force clamp). The shortening velocity during 
force clamp was measured and used to calculate power output. The fibre was then
released to slack length in order to re-measure the baseline force and it was then
lengthened to L₀ over a period of 5 ms. This step avoids a high eccentric force
transient during lengthening. The baseline measurement was saved again in the
600A protocol and used to compare with the measurement of stable isometric
force achieved after lengthening the fibre to L₀; the force attained in the next force
clamp was based on the difference between the newly saved values of baseline and 
isometric force. Four different force-control events were conducted during the 6-s 
activation at 25°C (see Extended Data Fig. 2a). Examples of a force clamp and of 
the fibre-length changes required to hold force constant are shown in Extended 
Data Fig. 2b, c. Relaxation and a final force baseline-check were also conducted at 
25°C (Extended Data Fig. 2a). Three activations provided up to twelve different 
force-clamp measurements in order to quantify a power versus force relationship 
and peak power for each fibre (see below). 

A fibre was counted as ‘tested’ if it did not break on the test apparatus, and if test 
conditions (solution temperature and chemistry) were maintained as prescribed 
in the experimental design. A fibre was counted as ‘included’ in the mechanical 
tests if the maximum isometric force >75 kPa, and during the repeated activations, 
isometric force for each test remained >80% of the peak isometric force observed. 
For each fibre, we conducted three activations, each with four force-control, or 
shortening, events in order to collect, at most, 12 points for a power-force curve 
fit. An individual data point (that is, a single force-control event) could be
rejected, either because of poor (for example, unstable or oscillating) force
during fibre shortening or because of low (<80% of maximum) isometric force, but
exclusion of a data point on this basis did not necessarily cause the fibre to be
rejected, unless the spread of usable data points was insufficient for curve fitting.
Of the 209 fibres initially tested with apparent success, 35 were excluded (three 
out of 40 cheetah fibres, 14 out of 64 lion fibres, 14 out of 71 zebra fibres, four out 
of 34 impala fibres). 
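
The two numerical inclusion thresholds can be written as a short check. The Python sketch below is illustrative only: the function name and the representation of a fibre as a list of isometric stresses (one value per test) are assumptions, and the separate requirement that enough usable points remain for curve fitting is not modelled.

```python
def fibre_included(isometric_stresses_kpa, min_peak_kpa=75.0, min_fraction=0.80):
    """Return True if a fibre passes both inclusion thresholds.

    isometric_stresses_kpa: isometric stress before each force-control event (kPa).
    min_peak_kpa: the peak isometric stress must exceed this value.
    min_fraction: every test must stay above this fraction of the peak.
    """
    peak = max(isometric_stresses_kpa)
    if peak <= min_peak_kpa:
        return False
    return all(s > min_fraction * peak for s in isometric_stresses_kpa)
```

For example, a fibre whose isometric stress falls from 200 kPa to 150 kPa during testing would be excluded, because 150 kPa is below 80% of the peak.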

The data for each fibre were analysed as described previously.


(1)

Equation (1) describes the dependence of relative power (Q = power / (Fisom × L0);
in s⁻¹) on relative force (force during shortening / Fisom; dimensionless).
Peak relative power (Qmax) was found by fitting a line to the data by adjusting three
parameters: Qmax, the force intercept (F0) and force at peak power (FQmax). An
example plot and best-fit curve are shown in Extended Data Fig. 2d. Peak power in
W kg⁻¹ is obtained by multiplying Qmax by maximum isometric stress in kPa and
dividing by fibre density, 1.064 g ml⁻¹ (ref. 38).
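
The unit conversion in the last sentence can be checked with a minimal sketch. The function and parameter names are assumptions made for illustration. Relative power in s⁻¹ multiplied by stress in kPa gives kW m⁻³, and dividing by a density in g ml⁻¹ (numerically 1,000 kg m⁻³ per unit) gives W kg⁻¹.

```python
def peak_power_w_per_kg(q_max_per_s, isometric_stress_kpa, density_g_per_ml=1.064):
    """Convert fitted peak relative power to mass-specific power.

    q_max_per_s: peak relative power Qmax (s^-1).
    isometric_stress_kpa: maximum isometric stress (kPa).
    density_g_per_ml: fibre density; 1.064 g/ml is the value quoted in the text.
    """
    # s^-1 * kPa = kW m^-3; dividing by g/ml (= 1000 kg m^-3) yields W kg^-1.
    return q_max_per_s * isometric_stress_kpa / density_g_per_ml
```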

After mechanical tests, the ‘low-performing’ single skinned fibres were pinned 
onto a gelatine base in cryomolds, flooded with OCT (Tissue-Tek) and frozen in 
liquid nitrogen. Sections (8-µm thick) were cut and immunostained with mouse
anti-MHC fast monoclonal antibody (MY-32, 1:1,000; ab51263, Abcam) for type-II
fibres, and mouse anti-MHC slow monoclonal antibody (1:50; MAB1628, Merck
Millipore) for type-I fibres. 

Muscle data statistics. A linear mixed-effects model was fitted in R (R Foundation 
for Statistical Computing) for peak power, velocity at peak power, stress at peak 
power and isometric stress against a factor distinguishing predator and prey 
with the interaction of this factor with a categorical variable ‘performance
classification’. Within the factor distinguishing predator and prey, we included
a nested random effect by subject and fibre. The residuals of this model exhibited 
heteroscedasticity and so the variance of the error term was allowed to vary by 
performance classification. General linear hypothesis tests were then performed. 
Temperature. Muscle power is highly temperature-dependent, and a previous
study showed, using its own data and literature data, that for the temperature
range of 20-35 °C a temperature coefficient (Q10, the ratiometric increase in rate
with a temperature increase of 10 °C) of 2.3 is appropriate (see figure 7 of ref. 26).
A Q10 of 2.3 was used to predict powers at a body temperature of 38 °C.
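
As a minimal sketch of this temperature correction, the standard Q10 relation scales a rate by Q10 raised to the power of the temperature difference divided by 10 °C. The function name and the choice of 25 °C (the activation temperature used in the fibre tests) as the reference in the usage note are assumptions for illustration.

```python
def scale_by_q10(value_at_ref, t_ref_c, t_target_c, q10=2.3):
    """Scale a rate (for example, power) from t_ref_c to t_target_c using Q10.

    q10 is the ratiometric increase in rate per 10 degC rise; 2.3 is the
    value stated in the text.
    """
    return value_at_ref * q10 ** ((t_target_c - t_ref_c) / 10.0)
```

Under these assumptions, a power measured at 25 °C would be multiplied by 2.3^1.3 to predict the value at 38 °C, for example `scale_by_q10(power_25, 25.0, 38.0)`.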

Collar design. All collars were designed and constructed in-house and are 
described in detail in ref. 9 and in the Supplementary Information. 

Ethical guidelines suggest a collar mass limit of 5-10% of the body mass to
minimize the effect on the animal; our collars were below that threshold at 0.3 to
1.0% (collar mass: cheetah, 340 g; lion, 970 g; zebra, 930 g; impala, 450 g).
Drop-off mechanisms were 108 g (Sirtrack) and 50 g (Biotrack). The electronics
package was similar in all collar versions, with almost identical functionality.

Signal processing. GPS-INS processing was used to reduce noise and improve 
precision for the position and velocity analysis, as well as increasing the temporal 
resolution of the data. GPS and IMU measurements were fused using a 12-state
extended Kalman filter in a loosely coupled architecture written in MATLAB (The
Mathworks). The total state formulation used propagates position, velocity and
orientation states with time using the IMU measurements in a simplified form of
the strap-down inertial navigation equations. The associated process noise was
estimated from the known error characteristics of the inertial sensors used. GPS
position and velocity updates were used as measurement updates, and receiver
accuracy data for each fix were used to estimate measurement noise to appropriately
weight the GPS to the inertial solution.

The filter was run in reverse time from the last GPS observation of each run
to the beginning of the buffered inertial data. During the short time period in
which only inertial data were present (delay between trigger and first GPS fix), the
filter propagation was equivalent to open-loop inertial navigation. The filter was
initialized using the last GPS position and velocity data, and Euler angles were
assumed to be zero, with covariances appropriate for the uncertainty in that
assumption. A Rauch-Tung-Striebel smoother was then applied in forward time on
the Kalman-filtered
data. This is equivalent to combining backward and forward solutions, effectively 
halving the open-loop INS integration period between GPS observations. In cases 
for which it was not possible to reconstruct the period before the first GPS obser- 
vation (time too long or GPS accuracy insufficient), runs start at medium speeds 
rather than very low speeds. 
Calculation of speed and stride times. Vertical accelerations were used to
determine stride times. A zero-phase band-pass Butterworth filter (fourth order)
was applied with cut-off frequencies of 1 Hz and 6.6 Hz (twice the maximum stride
frequency in the cheetah and impala). A peak detection function was used to
detect peaks with a minimum period of 0.25 s between peaks and a minimum
peak height of 0.1 g.
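
The peak-detection step can be sketched in plain Python. This is not the authors' MATLAB routine. It assumes the vertical acceleration has already been band-pass filtered, and the function names and the rule for resolving peaks that are too close together (keep the taller one) are illustrative choices.

```python
import math

def detect_stride_peaks(accel_g, fs_hz, min_period_s=0.25, min_height_g=0.1):
    """Return sample indices of stride peaks in a filtered acceleration trace (in g)."""
    min_gap = math.ceil(min_period_s * fs_hz)  # minimum peak separation in samples
    # Local maxima that reach the minimum height.
    candidates = [
        i for i in range(1, len(accel_g) - 1)
        if accel_g[i] >= min_height_g
        and accel_g[i] > accel_g[i - 1]
        and accel_g[i] >= accel_g[i + 1]
    ]
    # Accept the tallest peaks first; reject any peak too close to an accepted one.
    kept = []
    for i in sorted(candidates, key=lambda idx: accel_g[idx], reverse=True):
        if all(abs(i - j) >= min_gap for j in kept):
            kept.append(i)
    return sorted(kept)

def stride_frequency_hz(peaks, fs_hz):
    """Mean stride frequency from the time between the first and last peaks."""
    return (len(peaks) - 1) * fs_hz / (peaks[-1] - peaks[0])
```

Stride times then follow by dividing each peak index by the sampling rate.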

Species-specific gait parameters, such as transition speeds and expected stride
frequencies for walking and trotting (based on ref. 46), were used to remove double
peaks in strides in symmetrical gaits. Horizontal stride speed was derived from 
the Kalman-filtered velocity averaged over strides in order to remove the effects 
of speed fluctuation through the stride and collar oscillation relative to the centre 
of mass. 
Tangential acceleration, change of heading and centripetal acceleration over 
stride. Stride times were used to calculate tangential (fore—aft) acceleration, 
centripetal (turning) acceleration and change in heading between strides. The 
displacement vectors between consecutive strides were then calculated: 


$$U_i = P_i - P_{i-1} \qquad (2)$$

where P_i is the two-dimensional position at stride i.
Change of heading (Δθ) was calculated from the angle between the two vectors:

$$\Delta\theta_i = \sin^{-1}\left(\frac{U_{i-1} \times U_i}{|U_{i-1}|\,|U_i|}\right) \qquad (3)$$

Angular velocity (ω_i) was derived by dividing the change of heading by the time
between mid-stride positions ΔT:

$$\omega_i = \frac{\Delta\theta_i}{\Delta T} \qquad (4)$$

The tangential or fore-aft acceleration (a_{t,i}) and centripetal acceleration (a_{c,i})
were then computed from mid-stride speeds v_i:

$$a_{t,i} = \frac{v_i - v_{i-1}}{\Delta T} \qquad (5)$$

$$a_{c,i} = \frac{v_i^2}{r_i} = \omega_i v_i \qquad (6)$$


Negative values for tangential acceleration represent deceleration. Absolute
values were used for centripetal acceleration, treating right and left turns as
equal. For visual purposes the data in Fig. 3n were mirrored around the vertical axis.

Mass-specific centre of mass (COM) stride work (net COM kinetic energy 
change in a stride) was calculated as change in speed over a stride multiplied by 
stride average speed. Mass-specific COM power was calculated as the dot product
of stride-averaged tangential acceleration and stride-averaged velocity (that is,
forward acceleration multiplied by forward speed):


$$P_{t,i} = a_{t,i}\,v_i \qquad (7)$$
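
The stride-wise quantities in this section can be sketched as follows. This is an illustrative Python reading, not the authors' code: the function names, the input layout (lists of mid-stride positions, times and stride-averaged speeds) and the choice to difference each stride against the preceding one are assumptions. Because the heading change is taken from an arcsine, it is limited to ±90° per stride.

```python
import math

def heading_change(u_prev, u_curr):
    """Signed angle (rad) between consecutive 2D displacement vectors."""
    cross = u_prev[0] * u_curr[1] - u_prev[1] * u_curr[0]
    norm = math.hypot(u_prev[0], u_prev[1]) * math.hypot(u_curr[0], u_curr[1])
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.asin(max(-1.0, min(1.0, cross / norm)))

def stride_metrics(positions, times, speeds):
    """Per-stride heading change, accelerations and mass-specific power.

    positions: (x, y) in metres at each stride; times: seconds at each stride;
    speeds: stride-averaged horizontal speed (m/s) at each stride.
    """
    metrics = []
    for i in range(2, len(positions)):
        u_prev = (positions[i - 1][0] - positions[i - 2][0],
                  positions[i - 1][1] - positions[i - 2][1])
        u_curr = (positions[i][0] - positions[i - 1][0],
                  positions[i][1] - positions[i - 1][1])
        dt = times[i] - times[i - 1]
        d_theta = heading_change(u_prev, u_curr)
        omega = d_theta / dt                      # angular velocity (rad/s)
        a_t = (speeds[i] - speeds[i - 1]) / dt    # tangential acceleration (m/s^2)
        a_c = abs(omega * speeds[i])              # absolute centripetal acceleration
        metrics.append({"d_theta": d_theta, "omega": omega, "a_t": a_t,
                        "a_c": a_c, "power": a_t * speeds[i]})
    return metrics
```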


Calculation of grip limits. Grip limits are shown in Fig. 3. Friction poses a limit
on acceleration and is the product of friction coefficient µ and force normal to the
surface (based on acceleration due to gravity, g). Therefore the maximum total
horizontal acceleration a_max is limited to:

$$a_{\mathrm{max}} = \mu g \qquad (8)$$

where a_max is the resultant (combination) of tangential and centripetal acceleration:

$$a_{c,\mathrm{max}} = \mu g, \qquad a_{t,\mathrm{max}} = \mu g \qquad (9)$$

Maximum turning speed v_max depends on friction, gravity and turning radius and
is calculated based on equations (6) and (9):

$$v_{\mathrm{max}} = \sqrt{\mu g r} \qquad (10)$$
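
A minimal sketch of the grip limit and the friction-limited turning speed follows. The function names and the value g = 9.81 m s⁻² are assumptions for illustration.

```python
import math

G = 9.81  # assumed acceleration due to gravity (m/s^2)

def within_grip_limit(a_t, a_c, mu, g=G):
    """True if the resultant horizontal acceleration does not exceed mu * g."""
    return math.hypot(a_t, a_c) <= mu * g

def max_turning_speed(mu, radius_m, g=G):
    """Friction-limited speed (m/s) on a turn of the given radius."""
    return math.sqrt(mu * g * radius_m)
```

For example, `max_turning_speed(1.3, 10.0)` gives the speed limit on a 10-m radius turn for a friction coefficient of 1.3, one of the two limit lines quoted for the turning plots.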


Calculation of stride frequency. Regression lines were fitted to stride frequency 
versus speed data at running speeds. Sections with running data were identified
using an unsupervised clustering algorithm on three features derived from
windows of accelerometer signals (4-s long). Features were chosen on the
basis of domain knowledge and were the s.d. of the horizontal and vertical axis
accelerometer signals and an autocorrelation estimate of the stride frequency.
Features were normalized to have zero mean and unit standard deviation before
they were clustered using the k-means algorithm. The number of clusters
was determined using the Davies-Bouldin criterion and human inspection.
Subsequently, the sections identified to contain running data were isolated and
vertical acceleration was used to determine stride times (see ‘Calculation of speed
and stride times’); stride frequency was calculated from the time between
acceleration peaks. Regression lines were calculated for the subgroup from each bin
representing the middle 60%, the highest 20% of positive and highest 20% of
negative power (Fig. 3b, g).

Maximum performance analysis. Extracting values that reflect maximum
performance carries the risk of choosing outliers generated by non-Gaussian GPS
noise rather than realistic values. Previous work reduced the risk of overestimating
performance by weighting stride parameters, such as stride speed and accelerations,
by the previous and following stride. Here we chose a different approach, not
weighting, but calculating the 98th percentile for each of a number of bins (Fig. 3)
in order to also address the effect of different sample sizes and accelerations
that were not sustained for three consecutive strides. In addition, obvious errors
(speeds >30 m s⁻¹ and total stride-averaged accelerations exceeding a magnitude
of 20 m s⁻²) were removed from the dataset.

An inherent issue with comparing the performance of different species lies 
in their different movement patterns, with lion and zebra having a considerably 
higher proportion of straight, constant low speed strides than impala and cheetah. 
In order to extract manoeuvring strides, a cut-off based on the magnitude of the 
total horizontal acceleration (combined tangential and centripetal acceleration) 
was performed. This cut-off could not be universal, because different animals 
had different amounts of low-speed steady-state behaviour in their accelerometer 
traces. This manifested itself in large differences between species in kurtosis of the 
acceleration-distribution histograms. To address these differences a species-specific 
cut-off was used. To ensure that this cut-off still gave comparable results for the 
different species, the characteristic scale of the kurtosis for each distribution was 
estimated using: 


$$s_\alpha = \sigma_\alpha k_\alpha^{1/4} \qquad (11)$$

where s_α is our characteristic scale, σ_α is the standard deviation and k_α is the
Pearson's kurtosis, all for species α. If a cut-off of c_α was used for one species, then
we can calculate the cut-off for species β by:


(12) 


The effect of this cut-off on the distribution of total horizontal acceleration is 
shown in Extended Data Fig. 3a, b. 
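
The characteristic scale can be computed directly from a sample of accelerations, as in the hedged sketch below. The function names are illustrative. Population (biased) moments are used, which is an assumption. The proportional rescaling of the cut-off between species in `scaled_cutoff` is also an assumption about how the scale is applied, not a formula taken from the text.

```python
def characteristic_scale(accelerations):
    """Standard deviation times the fourth root of Pearson's kurtosis."""
    n = len(accelerations)
    mean = sum(accelerations) / n
    m2 = sum((x - mean) ** 2 for x in accelerations) / n  # variance
    m4 = sum((x - mean) ** 4 for x in accelerations) / n  # fourth central moment
    sigma = m2 ** 0.5
    kurtosis = m4 / m2 ** 2  # Pearson's (non-excess) kurtosis
    return sigma * kurtosis ** 0.25

def scaled_cutoff(c_alpha, s_alpha, s_beta):
    """Assumed rule: scale the cut-off in proportion to the characteristic scale."""
    return c_alpha * s_beta / s_alpha
```

Note that σ k^(1/4) equals the fourth root of the fourth central moment, so the scale has the same units as the acceleration itself.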

In Fig. 3n, tangential acceleration is plotted against centripetal accelerations. 
The Cartesian coordinates were transformed into polar coordinates in order to bin 
the data. Calculations were performed on absolute centripetal acceleration values 
to boost data point numbers in bins and then mirrored on the vertical axis; the 
semicircle was divided into a total of six bins. 

The cut-off was adjusted, so that the number of data points in a bin was at least 
200 for all species. The cut-off was determined by the impala, which had the lowest 
number of data points. 

The parameters were plotted versus horizontal speed (except for stride 
frequency) and the 98th percentile was calculated for each of a number of speed 
bins for which the width was defined by the requirement that each bin should 
include 400 data points. The final (highest speed) bins with fewer than 400 data
points were discounted. In Fig. 3 a moving average of three bins was applied to
all data except Fig. 3n. Data were interpolated to allow the calculation of species
performance ratios (Fig. 3) at 1 m s⁻¹ speed positions.
Statistical analysis. The maximum performance of the predator and prey was
compared using a set of linear models of maximum positive and negative power,
positive and negative tangential acceleration and absolute centripetal acceleration.

The maximum performance of each individual was quantified by taking the 
98th percentile of the positive and negative tangential acceleration and absolute 
centripetal acceleration of each individual. 

Negative and positive power covaried with speed and were binned by speed
as above, and the 98th percentile within each bin was computed for each subject
within a species. A linear regression was then performed and the predicted power
at 8 m s⁻¹ calculated for each subject.
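
The binning, percentile and regression steps can be sketched together. This Python example is illustrative rather than the authors' implementation: the function names are hypothetical, the percentile uses linear interpolation between order statistics (the text does not state which definition was used), and each bin is represented by its mean speed.

```python
def percentile(values, q):
    """q-th percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    rank = (len(xs) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (rank - lo)

def binned_percentile(speeds, values, bin_size=400, q=98.0):
    """Sort by speed, form bins of bin_size points, drop the incomplete last bin."""
    pairs = sorted(zip(speeds, values))
    bins = []
    for start in range(0, len(pairs) - bin_size + 1, bin_size):
        chunk = pairs[start:start + bin_size]
        mean_speed = sum(s for s, _ in chunk) / bin_size
        bins.append((mean_speed, percentile([v for _, v in chunk], q)))
    return bins

def predict_at_speed(bins, speed=8.0):
    """Least-squares line through the (speed, percentile) pairs, evaluated at speed."""
    xs = [s for s, _ in bins]
    ys = [p for _, p in bins]
    n = len(bins)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my + slope * (speed - mx)
```

For negative power, one option under the same assumptions is to pass the magnitudes of the negative values so that the 98th percentile again picks out the most extreme strides.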

Linear models were fitted to these data using restricted maximum likelihood,
with the maximum powers and accelerations as dependent variables, against a
factor for predator versus prey and a factor for each pairing (zero for cheetah- 
impala and one for zebra-lion). Models with an interaction term between these 
two factors were fitted, but comparing these models to those previously described 
(this time fitted using maximum likelihood) indicated the interaction term was 
superfluous (effect sizes were small and associated P values not significant). The 
interaction term was therefore dropped from the analysis in all models, except 
when comparing stride frequencies for which there was a substantial interaction 
term (effect sizes large and associated P values significant). Here, the model was 
fitted by individual species pairs. Owing to the presence of heteroscedasticity, the 
error term was allowed to vary for each species. 
Chase-evasion model. The model combines the observed acceleration capacity 
with a maximum speed constraint to produce possible position profiles for pred- 
ators and prey in the subsequent two strides of a chase. We simulate the possible 
positions of the prey given the prey’s initial speed. We then do the same for the 
predator, optimizing the predator’s initial speed to give maximum overlap in final 
positions of the predator and prey. 

We begin with the observed maximum accelerations for our subjects (Extended 
Data Table 2b). We approximate the possible impulsive accelerations of the animals
by assuming they have a maximum tangential acceleration forward, a_t, a maximum
reverse tangential acceleration, a_r, and a maximum centripetal acceleration, a_c.
The profile of possible accelerations is assumed to be two half-ellipses with the
semi-minor axis along the direction of motion. The top ellipse has semi-minor
axis radius a_t, the bottom ellipse has semi-minor axis radius a_r, and both have a
semi-major axis of length a_c.

The animals are assumed to have a maximum possible speed, v_m. This places a
further constraint on the possible profile of accelerations as no acceleration can 
result in a speed above this maximum. 

To find this constraint, we assume that a predator and its prey are galloping at 
a common stride frequency (Fig. 3b, g) and phase, and that the bulk of the impulse 
that they can achieve in a stride is performed in a short duration (stance). On any 
given stride the animal can apply an impulse to change direction, subject to the
constraint that the resulting speed cannot be greater than the animal’s maximum
speed, v_m. If the animal is at a speed v along a unit direction î and an impulsive
acceleration a_0 î + a_1 ĵ, with ĵ perpendicular to î, is to be applied, then the
resulting speed is:

$$\sqrt{\left(v + \frac{a_0}{f}\right)^2 + \left(\frac{a_1}{f}\right)^2} \qquad (13)$$

where f is the stride frequency. This must be less than v_m. This implies a pair of
quadratic relations between a_0 and a_1 subject to v and v_m of the form:


(14) 


(15)
The simulation allows our subjects to accelerate to anywhere within the area
formed by the union of the area above the negative root of this equation, below 
the positive root, and within the two half ellipses previously mentioned. 
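
The feasibility test for a candidate impulsive acceleration can be sketched as below. This is an illustrative Python reading of the constraints described above and is not taken from the authors' Supplementary code. The function and parameter names are hypothetical, and the speed after the impulse is assumed to change by acceleration divided by stride frequency in each direction.

```python
import math

def resulting_speed(v, a0, a1, f):
    """Speed after applying acceleration (a0 forward, a1 lateral) for one stride."""
    return math.hypot(v + a0 / f, a1 / f)

def inside_half_ellipses(a0, a1, a_fwd_max, a_rev_max, a_c_max):
    """True if (a0, a1) lies within the forward or reverse half-ellipse."""
    semi_minor = a_fwd_max if a0 >= 0 else a_rev_max
    return (a0 / semi_minor) ** 2 + (a1 / a_c_max) ** 2 <= 1.0

def is_feasible(v, a0, a1, f, v_m, a_fwd_max, a_rev_max, a_c_max):
    """Acceleration is allowed if it is within grip/muscle limits and the speed cap."""
    return (inside_half_ellipses(a0, a1, a_fwd_max, a_rev_max, a_c_max)
            and resulting_speed(v, a0, a1, f) <= v_m)
```

Setting the resulting speed equal to v_m and solving for a0 gives a0 = f(−v ± √(v_m² − (a1/f)²)), which is our own algebra for the two roots that bound the allowed region. An animal already travelling at v_m therefore has no feasible forward acceleration.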

We note that the possible acceleration profile depends both on the position of 
the animal and its current speed (an animal that is slow will not be constrained 
by its maximum possible speed, whereas one going at its maximum speed cannot 
accelerate forward). This means that simulating the animal's possible positions 
forward in time increases in complexity with each stride taken, as both the new 
position of the animal and the new speed must be retained. As such we confined 
ourselves to simulating two strides forward from our starting conditions; that is, 
we are only concerned with strategies for the predator and prey at the very end 
of a chase. 

We assume that the prey performs an evasive acceleration on the first stride, 
while the predator continues to chase without changing velocity. On the second 
stride, the prey again accelerates, and now the predator also has the ability to 
react to the acceleration it observes in the first stride. We ran 100 such
simulations for starting separations varying from the maximal separation that makes
capture possible within two strides down to half a predator length separation
(cheetah = 0.66 m, lion = 0.92 m). If the prey and predator are closer than this,
then the predator is already close enough for prey capture.

For a given prey speed and initial predator-prey separation, we find the
predator speed that maximizes the capture probability by means of a Nelder-
Mead simplex optimization, subject to the constraint that the initial predator
speed must be greater than or equal to the initial prey speed. Owing to ambiguity
in the solution space, a small penalty term encouraging faster speeds from the 
predator was added in the form eps  Vprea, With eps = 10~®. This ensured that for 
data with a range of optimal best speeds for the predator, the fastest was selected. 
This had no effect on the value of the optima up to four significant digits. 

To test how a change in predator or prey performance influences hunt outcome,
we adjusted the performance by multiplying the maximum recorded tangential
and centripetal accelerations of the prey or predator by a number ranging from
0.6 to 1.4, to deliver values to insert into the model for animals of lesser or
greater athleticism, respectively, and reran the simulation to obtain capture
probabilities. This number is the x-axis performance adjustment in Fig. 4g, h.
Maximum speed was not adjusted.

List of symbols. Muscle studies. Fisom, fibre isometric force; F, fibre force during
shortening; SL, fibre sarcomere length; L0, fibre length when the sarcomere length
is set to 2.55 µm; Q, fibre relative power; Qmax, fitted maximum relative power; F0,
fitted force intercept on a distribution of Q against F / Fisom; FQmax, fitted relative
force at maximum power; CSA, fibre cross-sectional area; Q10, the ratiometric
increase in rate with a 10°C temperature change.

Locomotion and model. i, stride number; P_i, two-dimensional position; U_i,
two-dimensional position difference between subsequent strides; Δθ_i, signed
change of heading; ω_i, heading angular velocity; ΔT, sampling interval; a, total
horizontal acceleration; a_t, tangential or forward acceleration; a_r, tangential
reverse acceleration; a_c, centripetal acceleration; a_0 and a_1, generic
accelerations; r, turn radius; v, stride-averaged horizontal speed; v_max, maximal
turning speed; v_m, maximum speed; P_t, mass-specific fore-aft power; µ,
coefficient of friction; m, body mass; g, gravity; α and β, species indices;
s, characteristic scale; σ, s.d.; k, Pearson's kurtosis.

Code availability. Python code for the simulation predator-prey model and data 
are available as Supplementary Data files. 

Data availability. The authors declare that all relevant individual fibre and 
individual stride (processed, not raw) data supporting the findings of this study are 
available as Supplementary Data or as Source Data. Any further data are available 
from the corresponding author upon reasonable request.

Species  Number of    Number     Total    Strides used        Number of strides      Distance per run (m)
         individuals  of trials  strides  (non-steady state)  per run (mean ± s.d.)  (mean ± s.d.)
Cheetah  5            520        23871    7509                50 ± 25                157 ± 122
Impala   7            515        22491    8884                47 ± 33                151 ± 149
Lion     9            2726       101110   15947               39 ± 22                80 ± 76
Zebra    7            1801       64952    14089               38 ± 39                69 ± 91


Extended Data Figure 1 | GPS data summary. a, Example manoeuvring
sequence for a cheetah showing position based on fused GPS-IMU data
(250 Hz) colour-coded according to speed and segmented for clarity (1-5,
duplicated in b-e). b, Speed based on fused GPS-IMU data (250 Hz).
c-e, Stride-wise values for speed (averaged over stride), tangential (fore-
aft) acceleration (change in stride speed/stride duration) and centripetal
(lateral) acceleration ((change in heading/stride duration) × stride speed).
f, Details on the animals used and datasets collected. A reduced dataset
of non-steady-state strides was used for analysis of maximum performance.
Note, the number of strides and distance per run were based on all strides
(steady-state included).

Cheetah                      37  237.7 ± 7.5  101.6 ± 5.2  1.81 ± 0.06  59.2 ± 2.3
Lion                         50  220.0 ± 8.6   96.6 ± 5.3  1.80 ± 0.05  56.2 ± 2.7
Both predators               87  227.5 ± 5.9   98.8 ± 3.7  1.80 ± 0.04  57.5 ± 1.8
High-performing fibres only  76  234.1 ± 6.2  107.9 ± 3.1  1.91 ± 0.03  60.5 ± 1.8

Extended Data Figure 2 | Muscle data summary. a-c, Time course of
stress (force) development in a single skinned impala fibre, showing 
transition of the fibre through pre-activation, activation and relaxation 
solutions, and stress development after temperature jump (T-jump) from 
1 to 25°C. The sample rate was 5 kHz. The grey noisy parts of the stress 
trace denote periods of solution change. The four downward ‘spikes’ in 
the stress record (at 9, 10, 11 and 12s) are distinct periods of force control, 
where the fibre length was first rapidly reduced from L0 and then reduced
at an appropriate rate to maintain force at pre-defined sub-maximal levels.
The broken-line box in a surrounds the first episode of force control and is
presented in b and c on an expanded time scale. b, Relative force (F/Fisom)
was reduced to 40% of the maximum for 20 ms, where F indicates force 
during shortening, Fisom indicates isometric force. Isometric force (Fisom) 
and the force during force clamp (F) were recorded as average values for 
the central 10-ms intervals (vertical lines). A force measurement, Fisom, 
was recorded just before each of the four force-control events and used in 
the calculation of F/Fisom. c, Shortening speed (in L0 s⁻¹) was derived from
the rate of change in fibre length during each force clamp. At the end of the
force clamp, fibre length was ‘quick-released’ to a slack length (70% of L0),
where force was reduced to the zero baseline. After 10 ms at slack length,
the motor lengthened the fibre back to the starting length (L0), isometric
stress was re-established (as shown in a) and another force control event
was initiated. d, Twelve points on a power-force relationship could be
obtained from three temperature jump activations of a single fibre. The
curves were fitted (see Methods) to give relative power (power/(Fisom × L0),
in s⁻¹) as a function of relative force (F/Fisom). Expressing both variables
in relative units is important for the curve-fitting process, mainly because 
the measurements of Fisom often vary between and within activations; in 
the example shown in a there was a small reduction in Fisom through the 
activation at 25°C. e, Peak isometric force relative to fibre cross-sectional 
area (CSA) for fibres from the four species. f, Power output relative to fibre 
volume. There was a distinct subpopulation of low-performance fibres 
(mostly from lion and zebra) that displayed lower power at a given fibre 
volume. Fibres with a shortening velocity at peak power of <1.35 L0 s⁻¹
(see also Fig. 2c) were classified as low performance. g, Peak power
relative to stress at peak power. The low-performing fibres also had stress
at peak power values that were relatively low; the data points below the
thin dashed black line have velocities of shortening <1.35 L0 s⁻¹. h, The
variability in stress at peak power was similar across the species tested.
i, Details about the muscle fibres. Mean (±s.e.m.) mechanical features
for single skinned skeletal muscle fibres from biceps femoris of cheetah,
lion, impala and zebra. Mean values are also categorized for the predator
and prey groups, and further as the high-performing sub-groups of fibres
(high-performing fibres had optimal shortening speeds >1.35 L0 s⁻¹, see
main text and Extended Data Fig. 2f, g, i).

Extended Data Figure 3 | Predator-prey run comparisons.
a, b, Histograms of all strides (a) and extracted non-steady state strides
(b), for which the cut-off was species-specific based on their kurtosis. The 
x axis is the normalized and squared horizontal acceleration in arbitrary 
units (that is, combined tangential and centripetal). In b, on the x axis, 
the cut-off is zero and the 98th percentile is one. Cheetah, blue; impala, 
red; lion, purple; zebra, yellow. Removing steady-state strides delivers a 
similar distribution tail for all four species. This is critical for deriving an 
appropriate 98th percentile, as this should be equally representative in 
all four species. c, Histogram of maximum stride parameters recorded in 
each run (speed, centripetal and tangential acceleration) for each species.
Colour-coded by individuals, n is the number of runs used for data
extraction. One concern of the comparison between predator and prey 
species is the potential lack of high-performance runs in prey species due 
to the low number of actual one-on-one chases. However, the distribution 
of the performance data shows that the cheetah and impala data include 
a considerably higher proportion of high-performance runs, whereas
the lion and zebra dataset includes a large percentage of slower runs.
Recognizing that the species differ in run characteristics (motivation, 
proportion of steady-state versus non-steady-state strides), we removed 
steady-state strides, based on the species-specific kurtosis, resulting in a 
comparable distribution in all four species (see Methods).

Extended Data Figure 4 | Performance metrics separated by individual
and species plotted against run distance and against run tortuosity.
Maximum accelerations and speeds were extracted from each run and
displayed versus distance covered during the run and versus tortuosity
of the run. Tortuosity is the ratio of distance covered in a run to net
displacement (distance between start and end of the run). Markers are
colour-coded per individual; dashed black line maximum values are based
on the 98th percentile. Numbers of runs (data points) are given in Extended
Data Fig. 1f. Cheetah, 520 runs; impala, 515 runs; lion, 2,726 runs; zebra,
1,801 runs.
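
Tortuosity as defined in this legend can be computed from a run's positions with the short sketch below. The function name and the input format (a list of two-dimensional positions in metres) are assumptions. A run that ends where it started has zero net displacement and raises a division error here.

```python
import math

def tortuosity(path):
    """Ratio of distance travelled along the path to straight-line net displacement."""
    travelled = sum(math.dist(p, q) for p, q in zip(path, path[1:]))
    net = math.dist(path[0], path[-1])
    return travelled / net
```

A perfectly straight run gives a tortuosity of 1, and larger values indicate more turning.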


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


[Extended Data Figure 5 plot panels: x axis, horizontal speed (m s⁻¹); species key: cheetah, impala, lion, zebra.]




Extended Data Figure 5 | Work and power analysed for each species. 
a, Cheetah. b, Impala. d, Lion. e, Zebra. a, b, d, e, Dots indicate the data 
points and the line marks the 98th percentile for data in speed bins as 
shown in Fig. 3a, b, d, e. Markers are colour-coded by individual, the 
solid black line is the 98th percentile. c, f, Comparison of the predator- 
prey pairs. c, Cheetah-impala. f, Lion—zebra. g, Comparison of the 
predator species (lion-cheetah). h, Comparison of the herbivore species 
(impala-zebra). i, The 98th percentile for all four species. c, f-i, Data are 
colour-coded by species (key is shown in i). In all four species maximum 
negative power was similar to maximum positive power. Muscle stress
can be considerably higher when performing negative work than positive
work, and a 60% higher fascicle power in lengthening (rather than
shortening) has been reported, so mass-specific muscle power can be
much higher in deceleration. Body geometry relative to the ground
reaction force vector or grip may limit the attainable horizontal ground
reaction force, and the muscles need to be arranged to lengthen while
experiencing the large horizontal forces. Many of the propulsive muscles 
are hip retractors (Extended Data Table 1), which are not configured to 
resist forward motion. 
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Extended Data Figure 6 | Tangential and centripetal acceleration the predator-prey pairs. c, Cheetah-impala. f, Lion-zebra. g, Comparison 
analysed for each species. a, Cheetah. b, Impala. d, Lion. e, Zebra. Dots of the predator species (lion—cheetah). h, Comparison of the herbivore 
indicate the data points and the line marks the 98th percentile for data species (impala-zebra). i, The 98th percentile for all four species. 

in speed bins as shown in Fig. 3a, b, d, e. Markers are colour-coded by c, f-i, Data are colour-coded by species (key is shown in i). 

individual, the solid black line is the 98th percentile. c, f, Comparison of 
Extended Data Figure 7 | Locomotor performance based on stride
parameters. This is the same as Fig. 3, but the ratios compare the two
predators and the two prey species. All values are averaged per stride or
represent the change over a stride. a-e, Acceleration. a, Positive net work
performed in each stride. b, Stride frequency. c, Stride power.
d, Increase in speed per stride. e, Forward acceleration with the
curved lines representing a mean power of 30, 60, 90, 120 and 150 W.
f-j, Deceleration. f-j, As a-e but for decelerating strides. k-n, Turning.
k, Centripetal acceleration. l, The relationship between speed and turn
radius with limit lines for a coefficient of friction (μ) of 0.6 and 1.3.

m, Change in heading versus speed. n, Lateral versus tangential 
acceleration with limits as for jv. In n, F represents pure forward 
acceleration; B, deceleration; and C, acceleration to the side. In each panel, 
there is one line per species that represents the 98th percentile for data 

in speed bins (bins always include 400 data points, therefore bin width 
varies). Cheetah, blue; impala, red; lion, purple; zebra, yellow. Bottom, 

the ratio of that parameter for cheetah to lion (green circle) and impala to 
zebra (magenta circle) is given for each speed bin. 
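
The equal-count binning described in this legend (bins of 400 data points, so bin width in speed varies) can be illustrated with a minimal Python sketch. The function names, the linear-interpolation percentile and the decision to drop a trailing incomplete bin are assumptions made for illustration; the legend does not state how the percentile was interpolated or how leftover points were handled.

```python
from typing import List, Tuple


def percentile(values: List[float], q: float) -> float:
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def binned_percentile(
    speeds: List[float],
    values: List[float],
    points_per_bin: int = 400,
    q: float = 98.0,
) -> List[Tuple[float, float]]:
    """Return (mean bin speed, q-th percentile of value) for equal-count bins.

    Strides are ordered by speed and cut into consecutive bins that each
    hold `points_per_bin` points. A trailing incomplete bin is dropped
    (an assumption of this sketch).
    """
    pairs = sorted(zip(speeds, values))
    result = []
    for start in range(0, len(pairs) - points_per_bin + 1, points_per_bin):
        chunk = pairs[start:start + points_per_bin]
        mean_speed = sum(s for s, _ in chunk) / len(chunk)
        result.append((mean_speed, percentile([v for _, v in chunk], q)))
    return result
```

Because every bin holds the same number of points, each percentile estimate rests on the same sample size, at the cost of uneven bin widths along the speed axis.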
Extended Data Figure 8 | Output of model of predator-prey interaction
and impact of performance differential on hunt outcome. See also Fig. 4. 
a, b, Plot showing output of simulation. a and b are equivalent to Fig. 4d 
with more subplots for cheetah-impala and lion-zebra, respectively. At the 
start of simulation, both have initial velocity towards the top of the page 
and initial separation. After one stride the prey can move to anywhere 

in the red or yellow ellipse by acceleration in the appropriate direction. 
Predator velocity remains unchanged, as there is no prey acceleration in 
the previous stride to react to. Initial positions are shown. Larger red or 
yellow ellipse perimeter is the area prey can reach after two strides of the 
chosen maximum acceleration. The blue or purple filled ellipse represents 
the locations the predator can occupy after its second stride (responding 

to the prey acceleration observed in first stride). The area of the prey 
ellipse that is covered by the predator ellipse line is defined as probability 
of capture. Predator is given a starting speed for each combination of prey 
speed and initial spacing that maximizes the capture probability. Rows are 
different initial prey speeds, values in red to the left of each row. Columns
are different initial predator-prey separations at the start of the simulation 
with values given in red below each column. Scale for all instances is given 
in the bottom left plot in metres (in black). The inset black numbers in 
each sub-panel are the initial (optimized for maximum success) predator 
speeds in m s⁻¹. c, The optimum lion speed to maximize overlap (hotter
colours indicate faster speed, key on the right) as a function of zebra speed 
(x axis) and starting separation (y axis). The histogram above the main 
plot shows the distribution of actual zebra speed at first turn of 10 degrees 
or more for each run (same x axis as the main plot) and the vertical 
histogram shows distribution of actual lion speed at first turn (scale as for 
heat bar). d, The proportional overlap (capture probability), as a function 
of zebra initial speed and starting separation. e, Modelled capacity for 
forward acceleration (speed increase per stride) as a function of speed 
(Extended Data Table 2b). Cheetah, blue; impala, red; lion, purple; zebra, 
yellow. 
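
The overlap measure in this legend can be sketched numerically. The Python below is a simplified illustration and not the model used in the study: it assumes axis-aligned ellipses given as hypothetical `(cx, cy, a, b)` tuples and interprets capture probability as the fraction of the prey-reachable ellipse that lies inside the predator-reachable ellipse, estimated on a regular grid.

```python
from typing import Tuple

Ellipse = Tuple[float, float, float, float]  # (centre x, centre y, semi-axis a, semi-axis b)


def inside(x: float, y: float, ellipse: Ellipse) -> bool:
    """True if point (x, y) lies in or on an axis-aligned ellipse."""
    cx, cy, a, b = ellipse
    return ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 <= 1.0


def capture_probability(prey: Ellipse, predator: Ellipse, n: int = 200) -> float:
    """Fraction of the prey ellipse area covered by the predator ellipse.

    The prey ellipse's bounding box is sampled on an n-by-n grid of cell
    centres; only points inside the prey ellipse are counted.
    """
    cx, cy, a, b = prey
    in_prey = 0
    in_both = 0
    for i in range(n):
        x = cx - a + (i + 0.5) * 2.0 * a / n
        for j in range(n):
            y = cy - b + (j + 0.5) * 2.0 * b / n
            if inside(x, y, prey):
                in_prey += 1
                if inside(x, y, predator):
                    in_both += 1
    return in_both / in_prey
```

Under these assumptions, a predator ellipse that fully contains the prey ellipse gives a value of 1, and disjoint ellipses give 0.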
Extended Data Table 1 | Proportion of animal that is locomotor muscle
Species  n  Provenance  Animal mass (kg)  Hind leg muscle  Front leg muscle  Spine/back muscle  Total locomotor muscle

51,52 7 front R 9, 9 = _ 
Greyhound 6 hind R 31.8 18.80% 18.40% 
Greyhound™’ 9 R 27.3 -- -- 9.70% -- 
Cheetah™® 8 Z 33.1 18.30% 14.20% -- -- 
Impala 5 w 49.7 17.50% 11.30% = 
Horse’ 7 D 510 19.40% --- -- --- 
Horse” 6 D 503 -- 6.60% - - 
be 6 D 383 10.20%" = _ zs 
Quarterhorse™ D 457 12.70%" = = oe 
Hare®®° 8 w 3.45 16.30% 9.30% 8.90% 34.50% 
Ostrich®' 11 F 105 33.70% - -- -- 
Ostrich®? 2 F 100 30.29% -- -- --- 
Lion® 1 Z 133 12.48% 14.30% -- -- 
Beef Cattle® -—- F 544.3 28.00% 8.60% - -- 
ot al - D 3.25 14.00% es = ae 
ee 1 a 23 10.40% _ = = 


Data are taken from published values for athletic species and number of animals dissected given when reported®!®’. Provenance: wild (W), racing/competition (R), zoo (Z), farmed (F), domestic (D). 
Complete datasets, including all muscles and the animal mass, are sparse and many of the ones summarized are from our own group. In some of the studies, the animals would be sedentary or have 
died from other causes, so it is likely that wild animals have more muscle®. It was previously reported® that skeletal muscle as a fraction of body mass is 53% for thoroughbred racehorses and 44% for 
other horses. All values are for muscle from both limbs as a percentage of total body mass. 


*Only 11 muscles reported. 
Extended Data Table 2 | Performance parameters

a
Performance parameter                 Predator  Herbivore
Pos. tangential acceleration (m s⁻²)  -0.5      -0.21
Neg. tangential acceleration (m s⁻²)  -0.42     -0.28
Pos. work (J kg⁻¹ BM)                 -0.17     -0.45
Neg. work (J kg⁻¹ BM)                 0.16      0.48
Pos. power (W kg⁻¹ BM)                -0.47     -0.41
Neg. power (W kg⁻¹ BM)                0.5       0.55

b
Species  Speed (m s⁻¹)  Centrip. accel (m s⁻²)  Pos. tang. accel (m s⁻²)  Neg. tang. accel (m s⁻²)  Pos. work (J kg⁻¹ BM)  Neg. work (J kg⁻¹ BM)  Pos. power (W kg⁻¹ BM)  Neg. power (W kg⁻¹ BM)
Cheetah  19.9  12.3  8.3  -10.6  33.3  -39  109  -114
Impala   13.8  10.9  5.7  -6.3   22.1  -24  59   -60
Lion     13.9  6.5   5.2  7      20.5  -24  48   -56
Zebra    10.6  5.6   3.9  -3.8   9.9   -9   24   -22


a, log-log slope of performance parameter versus mass for the two herbivore species and the two predator species. Stride values were extracted to represent species’ performance and evaluated 
versus body mass to explore whether the performance difference between small and large was concomitant with effects reported across a broad animal size range®. Parameters with increasing 
values (positive and negative work and power) were represented by maximum values whereas for parameters that plateaued (positive and negative tangential acceleration, centripetal acceleration), an 
average value was calculated. The slope of the logarithmic coordinates (log-log slope of performance parameter versus mass) was calculated for the two predators and two herbivores. The relationship 
is generally consistent in predators and in prey, with most parameters dropping with increasing size, but this does not provide an explanation for the magnitude of the differences seen, as most 
parameters would scale weakly with animal size. b, Maximum (98%) values for stride parameters for all species. Maximum values were determined using the 98th percentile (after species-specific 
steady-state strides were removed from the data, positive and negative data were calculated separately). These are the parameters used in the model. 
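
With only two species per group, the log-log slope in part a reduces to a two-point calculation. The Python below is a minimal sketch of that arithmetic; the function name is hypothetical, and the body masses and sign convention used in the study are not given in this legend, so the example values are synthetic.

```python
import math


def log_log_slope(mass_small: float, value_small: float,
                  mass_large: float, value_large: float) -> float:
    """Slope of log10(value) against log10(mass) through two points.

    All inputs must be positive, because the logarithm of a non-positive
    number is undefined.
    """
    if min(mass_small, value_small, mass_large, value_large) <= 0:
        raise ValueError("masses and values must be positive")
    if mass_small == mass_large:
        raise ValueError("the two masses must differ")
    d_value = math.log10(value_large) - math.log10(value_small)
    d_mass = math.log10(mass_large) - math.log10(mass_small)
    return d_value / d_mass


# Synthetic example: a parameter that halves when mass quadruples
# scales as mass to the power -0.5.
slope = log_log_slope(25.0, 0.2, 100.0, 0.1)
```

A negative slope means the parameter falls with increasing body mass, which is the pattern the legend describes for most parameters.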
Somatic mutations of ERBB2 and ERBB3 (which encode HER2 and HER3, respectively) are found in a wide range of
cancers. Preclinical modelling suggests that a subset of these mutations lead to constitutive HER2 activation, but most 
remain biologically uncharacterized. Here we define the biological and therapeutic importance of known oncogenic 
HER2 and HER3 mutations and variants of unknown biological importance by conducting a multi-histology, genomically 
selected, ‘basket’ trial using the pan-HER kinase inhibitor neratinib (SUMMIT; clinicaltrials.gov identifier NCT01953926). 
Efficacy in HER2-mutant cancers varied as a function of both tumour type and mutant allele to a degree not predicted by
preclinical models, with the greatest activity seen in breast, cervical and biliary cancers and with tumours that contain 
kinase domain missense mutations. This study demonstrates how a molecularly driven clinical trial can be used to refine 
our biological understanding of both characterized and new genomic alterations with potential broad applicability for
advancing the paradigm of genome-driven oncology.


Genomic profiling of human cancers has identified recurrent somatic 
mutations of HER2 (encoded by ERBB2) and HER3 (ERBB3), typically 
occurring in the absence of gene amplification’ *. Mutations in HER2 
are clustered in the extracellular, transmembrane and kinase domains. 
Unlike other mutant oncogenes, such as BRAF or KRAS, no single 
mutant allele predominates and the precise distribution of mutations 
varies by tumour type’. By contrast, HER3 mutations cluster primarily 
in the extracellular domain and to a lesser extent in the kinase domain. 
Although HER2 and HER3 mutations are found in a wide variety of 
cancers, their overall prevalence does not exceed 10% in any individual 
tumour type, and the rate is more typically less than 5% for HER2 and 
less than 1% for HER3. 

Biological modelling has yielded conflicting findings as to the func- 
tional consequences of HER2 and HER3 mutations. Substantial data 
suggest that a subset of these mutations induce ligand-independent 
constitutive HER2 receptor signalling and promote oncogenesis* ’. 
The mechanism of these oncogenic effects seems to differ by variant, 
with some causing enhanced HER2 kinase activity and others causing 
receptor dimerization®*. Mutations in HER3, which in its wild-type 
configuration has impaired kinase function, seem to rely on wild-type 
HER2 to exert their oncogenic effects. Most preclinical data that explore
the functional consequences of HER2 and HER3 mutations have been 
generated using engineered models that overexpress the mutation, and 
thus the results may be confounded by the known oncogenic effects 
of HER2 overexpression. Further reinforcing the potential importance
of this confounding variable, models of HER2 mutation generated by 
gene-editing techniques have failed to demonstrate a malignant pheno- 
type in the absence of mutations in other oncogenes such as PIK3CA?. 


Given the considerable diversity of HER2 and HER3 mutations, as 
well as the challenge of generating preclinical models that recreate their 
true biology in human cancers, we sought to define the therapeutic 
importance of HER2 and HER3 mutations by conducting SUMMIT—a 
global, multicentre, multi-histology basket trial in patients with 
tumours that contain these mutations (Extended Data Fig. 1). Patients 
were treated with neratinib, an irreversible pan-HER tyrosine kinase 
inhibitor, which potently inhibits the growth of HER2-mutant tumours 
in preclinical models®. Tumour tissue and plasma were collected to 
facilitate the detailed genomic characterization of patients. Here we 
present the results of this study, with a focus on the insights it provides 
into the biological and therapeutic importance of HER2 and HER3 
mutations in patients with cancer. 


Patient and mutation characteristics 
Baseline patient demographics are shown in Table 1 and Extended Data 
Table 1. In total, 141 patients (125 with HER2-mutant tumours, 16 with 
HER3-mutant tumours) received neratinib treatment. These patients 
were diagnosed with 1 out of 21 unique cancer types, the most com- 
mon being breast, lung, bladder and colorectal cancer (61% of patients 
treated). As has been seen in other basket studies!°", we identified and 
enrolled several orphan tumour types including cancers of the biliary 
tract, salivary gland, small bowel and vagina, as well as extramammary 
Paget's disease (in aggregate, 13% of all patients). Patients tended to be 
heavily pretreated with approximately half having received at least three 
previous lines of systemic therapy. 

Enrolled patients had 31 unique HER2 and 11 unique HER3 muta- 
tions (Extended Data Fig. 2). The most frequent HER2 mutations 
Table 1 | Patient demographics

Patient characteristic                HER2 mutant (n=125)  HER3 mutant (n=16)  Total (n=141)
Age, years
  Median (range)                      61 (30-83)           66 (39-82)          61 (30-83)
  <65 years, n (%)                    81 (64.8)            7 (43.8)            88 (62.4)
  ≥65 years, n (%)                    44 (35.2)            9 (56.3)            53 (37.6)
Sex, n (%)
  Female                              80 (64.0)            12 (75.0)           92 (65.2)
  Male                                45 (36.0)            4 (25.0)            49 (34.8)
ECOG performance status, n (%)
  0                                   37 (29.6)            1 (6.3)             38 (27.0)
  1                                   83 (66.4)            12 (75.0)           95 (67.4)
  2                                   5 (4.0)              3 (18.8)            8 (5.7)
Previous systemic treatment lines, n (%)
  Any                                 121 (96.8)           16 (100)            137 (97.2)
  1                                   33 (26.4)            1 (6.3)             34 (24.1)
  2                                   30 (24.0)            11 (68.8)           41 (29.1)
  ≥3                                  58 (46.4)            4 (25.0)            62 (44.0)
Median time from metastasis to
enrolment, years (range)              1.02 (0.0-15.0)      1.13 (0.3-4.5)      1.03 (0.0-15.0)
Tumour type, n (%)
  Lung                                26 (20.8)            0 (0)               26 (18.4)
  Breast                              25 (20.0)            0 (0)               25 (17.7)
  Bladder                             16 (12.8)            2 (12.5)            18 (12.8)
  Colorectal                          12 (9.6)             5 (31.3)            17 (12.1)
  Biliary tract                       9 (7.2)              2 (12.5)            11 (7.8)
  Endometrial                         7 (5.6)              1 (6.3)             8 (5.7)
  Cervical                            5 (4.0)              0 (0)               5 (3.5)
  Gastroesophageal                    5 (4.0)              2 (12.5)            7 (5.0)
  Ovarian                             4 (3.2)              1 (6.3)             5 (3.5)
  Other                               16 (12.8)            3 (18.8)            19 (13.5)


were S310, L755, Y772_A775dup and V777 alleles. The HER2 kinase 
domain was most commonly mutated (66%), followed by the extra- 
cellular (26%) and transmembrane/juxtamembrane (8%) domains. 
The anticipated relationships between the mutated HER2 domain 
and tumour type were observed, with extracellular domain mutations 
predominant in bladder cancer, kinase domain missense mutations in 
breast and colon cancer, and kinase domain insertions in lung cancer.
Missense mutations were the most common class of genomic alteration
(74%), followed by in-frame insertions (22%), the latter exclusively
affecting the kinase domain. Two tumours contained HER2 insertions/
deletions and one an in-frame kinase domain-retaining fusion
(GRB7-ERBB2). HER3 mutations were all missense variants and clustered
in the extracellular furin-like and receptor domains. In total, 87% (109 
out of 125) of HER2 and 75% (12 out of 16) of HER3 mutations were 
at positions now known to be mutational hotspots*. This pattern of 
HER2 and HER3 mutations was comparable to the spectrum of non- 
truncating HER2 and HER3 mutations observed in previously pub- 
lished genomic landscape studies, including The Cancer Genome Atlas 
(TCGA) and the International Cancer Genome Consortium (ICGC)4, 
although HER2 V777L and Y772_A775dup were more common in our
study cohort (13.6% versus 5.3% and 12.0% versus 2.7%, respectively; 
Extended Data Fig. 3). 


Treatment outcomes 

When stratified by tumour type, we observed responses to neratinib in 
patients with HER2-mutant breast, non-small-cell lung, cervical, biliary 
and salivary cancers, which led to expanded enrolment in several 
of these tumour types (Fig. 1a, Extended Data Table 1). Neratinib
exhibited the greatest degree of activity in patients with breast cancer
(n = 25 total, objective response rate at week 8 (ORR8) 32%, 95% confidence
interval 15-54%), with responses observed in patients with
missense mutations involving the extracellular and kinase domains, as 
well as insertions in the kinase domain. All patients with breast cancer 
were classified as HER2-negative (non-amplified) at the time of enrolment
as per established guidelines. Responses were observed in both
oestrogen receptor-positive (30%, 6 out of 20) and -negative (40%, 2 out 
of 5) tumours. Overall, these breast cancer data are generally consistent 
with a previous report’®. In patients with lung cancer (n =26), in which 
insertions in exon 20 predominate, we observed only one objective 
response. Of note, HER2 exon 20 insertions are paralogous to EGFR
exon 20 insertions, which are resistant to first- and second-generation
EGFR tyrosine kinase inhibitors. Notably, the only patient with
lung cancer to achieve a response evaluation criteria in solid tumours 
(RECIST) response had a kinase domain missense mutation (L755S). 
Despite the low response rate, the median progression-free survival in 
recurrent lung cancer was 5.5 months, with 6 patients remaining on 
therapy for more than 1 year, which compares favourably to second-line 
chemotherapy and immune checkpoint inhibitors!’, suggesting that 
neratinib may have a positive effect on the natural history of this 
disease. Responses were also observed in biliary and cervical cancers, 
and enrolment is ongoing in these cohorts to define this activity better. 
No responses were observed in bladder cancer (n= 16) or colorectal 
cancer (n = 12), suggesting lineage-dependent resistance to single-agent 
pan-HER kinase inhibition in these tumour types. In summary, among 
the HER2-mutant cohorts, breast cancer met the primary endpoint for 
efficacy, whereas lung, colorectal and bladder cancers did not. For the 
remaining tumour-specific cohorts, enrolment is continuing and they 
have therefore not undergone final efficacy analysis. Despite preclinical 
data to suggest that HER3 mutations can be oncogenic drivers, no 
responses to neratinib were observed in patients with HER3-mutant 
tumours. 
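
The breast cancer response rate above (32% of 25 patients, that is, 8 responders) is reported with a 95% confidence interval. The text does not say which interval method was used; the Python sketch below assumes an exact (Clopper-Pearson) binomial interval, a common choice for small single-arm cohorts, and the function names are illustrative.

```python
from math import comb
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f: Callable[[float], float], target: float) -> float:
    """Bisection on [0, 1] for f(p) = target, with f increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided binomial confidence interval for k successes in n trials."""
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2.0)
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2.0)
    return lower, upper


# 8 responders among 25 patients with HER2-mutant breast cancer.
low, high = clopper_pearson(8, 25)
```

The resulting limits can be compared with the reported interval of 15-54%; the width of that interval reflects how little a cohort of 25 patients constrains the true response rate.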

When stratified by mutant allele, responses were observed in patients 
with tumours containing HER2 S310, L755, V777, G778_P780dup
and Y772_A775dup mutations (Fig. 1b). Among patients with HER2 
kinase domain hotspot missense mutations (n = 42), responses were 
noted in four unique tumour types (breast, biliary, lung and salivary 
gland). By allele, we observed responses in several kinase domain 
mutants including L755S (n=4), V777L (n=4) and L869R (n= 1). 
In patients with HER2 hotspot extracellular domain mutations (S310, 
n= 30), responses were observed in breast, cervical and biliary cancers 
(n= 1 for each), but not in bladder cancer, the cancer type in which 
these mutations predominate. Similarly, in patients with HER2 exon 
20 insertions (n = 28), responses were observed in two patients with
breast cancer, but none were seen in patients with lung cancer, in which 
this class of alteration is most common. In exon 20 insertions, preser- 
vation of glycine at the 770 position, which seems to facilitate binding 
of covalent HER kinase inhibitors such as neratinib, did not predict for 
response as previously suggested by preclinical modelling!* (Extended 
Data Fig. 4). Similarly, the number of amino acids involved in the 
insertion did not seem to predict outcome, with responses observed 
in patients with both 3 (G778_P780dup) and 4 (Y772_A775dup) amino
acid insertions. Finally, among the 15 patients with HER2 mutations 
not known to be hotspots, only one responded to neratinib. Notably, 
this response occurred in a patient with breast cancer and a complex 
insertion/substitution (L755_E757delinsS), which, to our knowledge, 
has not been observed previously. Although this case illustrates that the 
tumours of some patients may be addicted to truly private oncogenic 
drivers (those arising in only a single patient), it is also noteworthy that 
this insertion occurs in a domain that is the target of recurrent inser- 
tions. The absence of clinical activity in the remaining 14 patients with 
cancers with non-hotspot mutations suggests that, although the recur- 
rence of a mutation in HER2 is insufficient to define it as sensitizing to 
a HER2 kinase inhibitor, the absence of recurrence (that is, mutations 
that do not occur at hotspot positions) provides circumstantial evidence 
that the alteration is unlikely to be a driver. 

Although the overall numbers of patients in each subgroup preclude 
formal statistical comparison, integrating efficacy, mutational and 
lineage data, we observed that clinical benefit from neratinib therapy 
appeared to vary as a function of both mutational and disease context 
Figure 1 | Individual treatment outcome and response for 141 patients
grouped by tumour cohort and mutant allele/domain. a, b, Top, percentage
best change from baseline in the target lesion assessed by the appropriate
response criteria (RECIST version 1.1 or PET). Each bar is colour coded
according to its mutation allele/domain, for patients grouped by tumour
cohort (a), or tumour type, for patients grouped by mutant allele/domain
(b). Middle, best overall response. Bottom, progression-free survival
(PFS), colour-coded by treatment status. *Non-evaluable. Cerv, cervical;
endo, endometrial; gastro, gastroesophageal; ov, ovarian; PET,
positron-emission tomography.
(Fig. 2). In tumour types sensitive to neratinib therapy, such as breast,
biliary and cervical cancers, responses were collectively observed across 
all types and classes of HER2 mutations. By contrast, in lung cancer, a 
tumour type that exhibits modest sensitivity to neratinib, response was 
limited to a patient with a HER2 kinase domain missense mutation—a 
class of mutation with greater in vitro sensitivity to neratinib®. Finally, 
in tumour types with intrinsic lineage-based resistance to neratinib, 
such as bladder and colorectal cancers, responses were not observed 
regardless of the HER2 mutation, type or class. 


Safety 

All patients received neratinib with mandatory anti-diarrhoeal 
prophylaxis. With this regimen, the rate of grade 3 diarrhoea was 22% 
(Extended Data Table 2), consistent with previous experience’. Among 
patients who developed grade 3 diarrhoea, the median time to onset 
was 10 days and the median duration of the diarrhoea episode was 
2 days. Patients were typically managed with dose interruption and 
reduction, with only 2.8% permanently discontinuing therapy owing 
to diarrhoea. The remainder of adverse events were predominantly 
low-grade. 


Central confirmation of HER2 and HER3 mutations 

There is active debate within the cancer research community as to 
whether central confirmation of mutational status before study entry 
is optimal for determining trial eligibility for precision medicine 
studies. To define the reproducibility of local mutational testing, DNA 
from archival formalin-fixed paraffin-embedded tumour and plasma
samples were re-sequenced (see Methods). A total of 33 patients
(26 HER2-mutant, 7 HER3-mutant) were excluded from this 
concordance analysis because the local test used was the same as the 
central tumour assay being evaluated. Of the remaining 99 patients with 
HER2 mutations, adequate material for tumour genomic testing was 
unobtainable for 26 patients. Overall, concordance in the remaining 
patients based on central tumour and/or plasma sequencing was 95% 
(69 out of 73), with 38 patients assessed by tissue and plasma, 14 by tis- 
sue alone, and 21 by plasma alone. Central testing identified one locally 
reported mutation (V773M) as a germline polymorphism and this 
patient, with renal cell carcinoma, had progressive disease at first scan. 
Central testing in the four cases in which the HER2 mutation could not 
be confirmed passed all quality-control metrics, but in two patients the 
testing was performed on material collected at least three years after 
the tissue used for local testing, raising the possibility that tumour het- 
erogeneity was involved in the discordance. None of the patients with 
discordant HER2 results responded to neratinib, and their median 
progression-free survival was only 43 days (range: 5-58 days). Among 
the 9 patients eligible for concordance testing with HER3 mutations, 
tumour tissue was available for central sequencing in 8 patients, and 
overall concordance was 75% (6 out of 8). 
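
The patient accounting behind these concordance figures can be retraced with a few lines of arithmetic. The Python below is a bookkeeping sketch that uses only counts stated in this section and in Table 1; the variable and function names are illustrative.

```python
def concordance_percent(confirmed: int, evaluable: int) -> float:
    """Percentage of evaluable patients whose local result was confirmed centrally."""
    return 100.0 * confirmed / evaluable


# HER2: 125 enrolled, 26 excluded (local test identical to the central assay),
# 26 more without adequate material for tumour genomic testing.
her2_eligible = 125 - 26              # patients eligible for concordance testing
her2_evaluable = her2_eligible - 26   # patients with a usable central result
her2_by_sample = 38 + 14 + 21         # tissue and plasma, tissue alone, plasma alone
her2_concordance = concordance_percent(69, her2_evaluable)

# HER3: 16 enrolled, 7 excluded, tissue available for 8 of the remaining 9.
her3_eligible = 16 - 7
her3_concordance = concordance_percent(6, 8)
```

The three sample categories sum to the same evaluable total as the subtraction, so the counts in the paragraph are internally consistent.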


Genomic modifiers of response 

Given the variability of treatment response, even among patients with 
the same tumour lineage and HER2-mutant allele, we sought to iden- 
tify other genomic modifiers of response through broader genomic 
characterization of tumour-derived DNA (see Methods). First, we 
Figure 2 | Integrated efficacy by tumour type and HER2 allele/domain.
The y axis represents the tumour types, and the x axis represents the 
mutated allele/domain and hotspot status. The hotspot mutations are 
further broken down into the various domains. The size of the circle is 
proportional to the count of the tumour type and allele/domain; the 


explored the relationship between ERBB2 amplification and outcome, 
as this is a well-established predictor of response to HER2-targeted 
therapies in patients lacking HER2 mutations. In total, 17% of patients 
(15 out of 86) had concurrent HER2 mutations and gene amplification. 
Amplifications preferentially targeted the mutant allele locus (86%, 12 
out of 14 evaluable). Using a dichotomous definition of clinical benefit 
(stable disease or partial response lasting at least 24 weeks), ERBB2 
amplification did not correlate with outcome (P= 0.50; Fig. 3), sug- 
gesting that in the presence of HER2 mutations, amplification may not 
confer additional sensitivity to irreversible HER kinase inhibitors. 
We also explored the relationship between ERBB2 mutation clonality and
outcomes. In the 74 patients with adequate material to allow definitive
assessment of ERBB2 mutant clonality, the mutation was clonal in 95% 
(70 out of 74; Extended Data Fig. 5a). None of four patients with a 
subclonal ERBB2 mutation achieved clinical benefit. 

Hypothesizing that tumours with an increased tumour mutational 
burden (TMB) might be more likely to acquire HER2 mutations with- 
out developing oncogenic dependence (that is, passenger mutations), 
we evaluated whether overall TMB status affected outcome. Using a 
previously validated cut-off (>13.8 non-synonymous mutations per 
megabase of DNA’), 20% of patients (17 out of 86) met criteria for 
a high TMB. In total, 24% of patients (16 out of 66) without clinical 
benefit versus 5% of patients (1 out of 20) with benefit met criteria for 
a high TMB, a trend that did not reach statistical significance (P=0.10). 
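
The comparison of high-TMB frequency between the two outcome groups (16 of 66 without clinical benefit versus 1 of 20 with benefit) is a 2 × 2 contingency problem. The sketch below assumes a two-sided Fisher's exact test, the test named in the Figure 3 legend for the co-mutation analysis; the text does not state which test produced the TMB P value, and the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # The small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: no clinical benefit (n = 66), clinical benefit (n = 20).
# Columns: high TMB, not high TMB.
p_value = fisher_exact_two_sided(16, 50, 1, 19)
```

With a single high-TMB patient in the benefit group, a result above the conventional 0.05 threshold is expected even for a sizeable difference in proportions, which is why the text describes a trend only.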

Next, we evaluated whether the pattern of co-mutations affected 
clinical benefit in the subset of patients where broader profiling was 
available (n = 86). In patients with HER2-mutant disease, coinci- 
dent mutations in TP53 and HER3 were enriched in patients with no 
clinical benefit (nominal P=0.018 and P=0.064, respectively; Fig. 3). 
Although not significant after correcting for multiple hypothesis 
testing, potentially owing to the relatively small sample size, it is note- 
worthy that no patients with clinical benefit possessed co-mutation of 
HER2 and HER3. Concurrent mutation of these genes was observed 
colour of the circle reflects the median percentage best change in the
target lesions (any zero or positive median change is indicated in white). 
The stacked bars represent the best overall response for the tumour type 
or domain/allele, as indicated in the key. ECD, extracellular domain; ICD, 
intracellular domain; TMD, transmembrane domain. 


in multiple cancer types (breast n = 3, bladder n = 2, gastroesophageal 
n=2, colorectal n=1 and pancreatic n= 1) and involved a variety of 
unique HER2 and HER3 mutations (n =8 and n=9, respectively). 
Expanding our analysis to genomic activation at the pathway level, 
we identified somatic mutations of known oncogenic potential 
and grouped them by those involving the receptor tyrosine kinase 
(RTK)/RAS/RAF and PIK3CA/AKT/mTOR pathways, and cell cycle 
checkpoints (Extended Data Fig. 5b). In this analysis, concurrent aber- 
rations in cell cycle checkpoints were associated with lack of clinical ben- 
efit (P=0.043), and activation of RTK/RAS/RAF also trended towards 
a worse outcome (P= 0.060). The association between the cell-cycle 
pathway and lack of clinical benefit seems to be primarily driven by 
TP53 mutations, losing significance upon removal of TP53 mutations 
(P = 0.769). Interestingly, activation of the PI3K/AKT/mTOR pathway,
an established negative predictor of response to HER2-targeted
therapy in HER2-amplified breast cancer””-**, did not adversely affect 
the likelihood of clinical benefit (P= 0.753). It is possible that the clinical 
impact of concurrent gene/pathway activation may vary by tumour type, 
and future disease-specific studies are needed to define these associa- 
tions better. Although these were exploratory analyses that will require 
confirmation, our results suggest that concurrent activation of specific 
genes as well as pathways may act as an additional modifier of response 
beyond cancer type and specific HER2 mutant allele. 


Discussion 

The ability to profile cancer comprehensively at the point of care has 
made possible the opportunity to personalize therapy for each patient 
based on the compendium of genomic alterations identified”. Despite 
the promise of this approach, implementing this paradigm in clinical 
practice has been hampered by considerable gaps in knowledge about 
the biological and clinical importance of most genomic variants 
identified”. This challenge is exemplified by the marked diversity and 
wide distribution of HER2 and HER3 mutations in human cancers, 
Figure 3 | Genomic modifiers of response and outcome by treatment
duration. Comprehensive OncoPrint of the dichotomous clinical 

benefit groups for 86 patients with broad profiling data (left: no benefit 
(n= 66, biologically independent samples), right: clinical benefit (n = 20, 
biologically independent samples)). From top to bottom: TMB with the 
dotted line indicating the threshold for high TMB at 13.8 mutations (mut) 


as well as by the difficulty of generating preclinical models of these 
mutations that correctly recreate their biology in patients. To our 
knowledge, SUMMIT provides the first comprehensive dataset on the 
clinical actionability of HER2 and HER3 mutations. We found that 
HER2 mutations are associated with HER2-dependence in a subset 
of patients with HER2-mutant tumours, but that response to HER 
kinase inhibition varies as a function of the individual mutant variant,
the tumour type as well as the pattern of co-mutations present. 

Although we identified promising preliminary activity for neratinib 
in breast, biliary and cervical cancers, the response rate in these tumours 
was still lower than with approved therapies that target oncogenic 
alterations in EGFR, ALK, ROS1 and BRAF. The low response rate in 
lung cancer, in which HER2 mutations exhibit mutual exclusivity
with other known drivers”, is also notable and may in part reflect a 
lower potency of neratinib inhibition in Y772_A775dup compared to 
other insertions or missense mutants’*. Successfully targeting HER2 
activation in other contexts has historically necessitated drug com- 
binations. For example, single-agent trastuzumab has a response rate 
of only approximately 20% in ERBB2-amplified breast cancer”®”’. By 
contrast, the overall survival in ERBB2-amplified breast and gastro- 
esophageal cancers is markedly improved by adding trastuzumab to 
chemotherapy**”’. More recently, the intensification of HER2 inhibi- 
tion through the combination of two HER2-targeted agents has been 
shown to result in synergistic efficacy in patients with ERBB2-amplified 
breast®”-”” or colorectal**** cancers, as well as in HER2-mutant colorec- 
tal cancer xenografts®. Cumulatively, these data suggest that combining 
neratinib with another HER2-targeted therapy is a rational next step, 
and SUMMIT has been amended to evaluate this approach in multiple 
HER2-mutant tumour types. 

SUMMIT represents a continued evolution in the design of basket 
studies, which enrol patients on the basis of qualifying mutations rather 
than tumour type. The initial generation of these studies focused on 
evaluating individual somatic mutations that were already clinically 
validated in one cancer (such as BRAF V600 in melanoma) in other 
tumour types!%%°. More recently, basket studies have been used to 
generate initial or even practice-changing clinical data of truly novel 
genomic biomarkers, especially when these genomic alterations occur 
at low frequency across a wide distribution of cancer types!!*°°”, 
SUMMIT extends this concept one step further by demonstrating for 
the first time how a single study can be used to simultaneously evaluate 


per megabase; microsatellite (MSI) status; allele/domain; tumour type; 
HER2 (ERBB2) status showing amplification; clonality and the presence 
of a single or multiple mutations; and co-alterations in genes associated 
with key pathways. *P=0.064, **P =0.018, Fisher's exact test. Statistical 
significance is lost when corrected for multiple hypothesis testing. 


a range of individual variants in HER2 and HER3, each with varying 
degrees of prior biologic characterization. This permissive enrolment 
strategy allowed us to treat patients harbouring mutations that, at the 
time of enrolment, had not been characterized preclinically as gain-of- 
function but were either recurrent or paralogous to known activating 
mutations in homologous genes. For example, patients with previously 
uncharacterized HER2 variants, such as V697L, D769N/H/Y and 
L869R, were included in this manner and responded to treatment, 
thus providing initial clinical proof-of-concept that these mutations 
confer a gain-of-function phenotype even before formal biologic 
characterization. The approach of pairing a permissive enrolment 
strategy with allele prioritization based on recurrence, paralogy and 
other readily computable features has potentially broad applicability 
to implementing genomic-driven oncology”. This strategy will take 
on even greater importance as clinical testing moves from targeted 
sequencing to whole-exome or even whole-genome sequencing, techniques 
that will allow for evaluation of an even greater number of therapeutic 
hypotheses but will also exponentially expand the number of 
uncharacterized alleles we routinely identify. 

SUMMIT provides additional insights into the conduct of mole- 
cularly driven oncology studies. Our ability to understand the com- 
plex interactions between tumour lineage, individual HER2 variant 
and response to neratinib was only possible because of the relatively 
large size of this study (n = 141). By comparison, many of the ‘master/ 
umbrella’ protocols that are currently underway are designed to enrol 
a maximum of 30-40 patients into each genomically defined treatment 
arm. Our experience suggests that many studies of this size may be 
inadequately powered to identify the subgroups with true efficacy, 
assuming that most genomic alterations will not predict for tumour- 
type agnostic efficacy. SUMMIT also demonstrates the feasibility 
of enrolling patients based on local testing, with patients treated on 
the basis of 30 unique sequencing assays performed in 25 different 
laboratories. Despite this, concordance on retrospective central review 
was extremely high (96%). 

An important impediment to progress in oncology has been the 
limited availability of preclinical model systems that accurately recreate 
the complex biology of human cancer. Although important strides have 
been made, the wide-scale profiling of cancer in the clinic provides the 
potentially transformative opportunity to interrogate cancer biology at 
the bedside in a manner previously only possible at the bench. Here, 
we demonstrate how this opportunity can be leveraged to probe the 
biology of a diverse set of HER2 and HER3 mutations across a variety 
of solid tumours through pharmacological HER kinase inhibition in 
patients. In doing so, we found that response to pharmacological inhibi- 
tion was based on the characteristics of both tumour type and genomic 
variant to a degree that was not predicted by established preclinical 
models. In summary, SUMMIT demonstrates how the clinical trial 
can become an important tool in refining our understanding of the 
biological dependencies in human cancers. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Patients. Eligible patients had histologically confirmed advanced solid tumours 
harbouring HER2 or HER3 mutations, an Eastern Cooperative Oncology Group 
(ECOG) performance score of 0-2 and an unlimited number of previous therapies. 
Patients with previous exposure to HER kinase inhibitors and unstable brain metas- 
tases were excluded. HER2 and HER3 mutations were determined by local tumour 
testing as routinely performed or ordered by each participating site. In total, 85% 
(120 out of 141) of enrolled patients were identified by next-generation sequenc- 
ing assays. In 81% of cases (97 out of 120), the next-generation sequencing assay 
included full exon coverage for ERBB2 or ERBB3, whereas in 19% (23 out of 120) of 
cases, only select exons or hotspots were included in the assay design. The remaining 
15% (21 out of 141) of patients were enrolled via RT-PCR, Sanger, pyrosequencing, 
or mass spectrometry-based sequencing methods. The study was approved by the 
institutional review board or independent ethics committee at each site and com- 
plied with the International Ethical Guidelines for Biomedical Research Involving 
Human Subjects, Good Clinical Practice guidelines, the Declaration of Helsinki, and 
local laws. Written informed consent was obtained from all participants. 

Study design, treatment and endpoints. This was a multi-cohort basket study 
of patients with solid tumours harbouring HER2 and HER3 mutations. Patients 
with HER2-mutant tumours were enrolled into one of several disease-specific 
cohorts or an ‘other’ cohort for tumour types not otherwise specified; all patients 
with HER3-mutant tumours were enrolled to one cohort. Patients known to 
contain both HER2 and HER3 mutations at the time of enrolment were assigned 
to the HER2-mutant cohort. Patients were treated with neratinib 240 mg daily on 
a continuous basis with mandatory loperamide prophylaxis during cycle 1. The 
primary endpoint was the objective response rate at week 8 (ORR8), as assessed 
by investigators according to RECIST (version 1.1). Secondary endpoints included 
best overall response, progression-free survival, overall survival and safety. 
Patients who were not evaluable by RECIST were permitted to enrol and were 
evaluated for response by 18F-fluorodeoxyglucose 
PET according to a modified version of the original PET Response Criteria in 
Solid Tumours (PERCIST; version 1.0)*°, referred to here as PET Response Criteria 
(PRC, Extended Data Table 3). 

Assessments. Disease assessments with computed tomography, magnetic 
resonance imaging or combined positron emission tomography-computed 
tomography (for those evaluated by PRC) were performed at baseline and then 
every 8 weeks until disease progression, death or withdrawal. Adverse events were 
graded by the investigator according to the Common Terminology Criteria for 
Adverse Events (version 4.0) until day 28 after discontinuation of study treatment. 
Genomic biomarker studies. All samples were assigned anonymized identifiers by 
the study sponsor based on the order of study enrolment. Both tumour DNA and 
tumour-derived cell-free DNA in plasma were collected with the goals of confirming 
locally reported HER2/3 mutations as well as evaluating how ERBB2 and ERBB3 copy 
number and clonality as well as co-mutational pattern affected outcome. Collection of 
archival tumour and plasma samples was mandatory for all patients. Next-generation 
sequencing was performed using targeted sequencing of pretreatment DNA from 
formalin-fixed paraffin-embedded tumour and matched blood specimens (preferen- 
tially) and cell-free DNA (if tumour was not available or was inadequate). A custom 
single-gene ERBB2 capture next-generation sequencing test was also performed on 
pretreatment cell-free DNA in a subset of patients with HER2-mutant disease. 
Central sequencing confirmation. For patients with adequate material, DNA 
from formalin-fixed paraffin-embedded (n= 91) or tumour-derived cell-free DNA 
from plasma (n= 15) and matched germline DNA (n= 102) underwent targeted 
next-generation sequencing assay using Memorial Sloan Kettering-Integrated 
Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT)’, producing an 
average of 738-fold coverage per tumour (range: 253-1,383). In brief, this assay uses 
a hybridization-based exon capture designed to capture all protein-coding exons and 
select introns of oncogenes, tumour-suppressor genes and key members of pathways 
that may be actionable by targeted therapies. In this study, either 341 (n= 18) or 410 
(n= 88) key cancer-associated genes were analysed (Supplementary Information). 
Sequencing data were analysed as previously described to identify somatic single- 
nucleotide variants, small insertions and deletions, copy number alterations and 
structural arrangements”. In addition, hotspot alterations were identified using 
an adaptation of a previously described method’ applied to a cohort of 24,592 
sequenced human cancers“°. For gene-level analysis, select genes within our tar- 
geted 341/410 MSK-IMPACT panel involved in the RTK/RAS/RAF, PIK3CA/AKT/ 
mTOR, and cell cycle checkpoint pathways were selected using the KEGG pathway 
database". For pathway level analysis, only potentially oncogenic alterations in the 
selected genes were included and determined to be oncogenic by OncoKB (version 
September 2017), a curated knowledge base of the oncogenic effects and treatment 
implications of mutations and cancer genes (http://www.oncokb.org”’). 

HER2 amplification and clonality analysis. For patients in the HER2-mutant arm 
with MSK-IMPACT sequencing data (with matched germline DNA, n= 74), the 
Fraction and Allele-Specific Copy Number Estimates from Tumour Sequencing 
(FACETS) algorithm (version 0.3.9) was used to estimate tumour purity and ploidy, 
and total and allele-specific copy number”. Tumour samples with purity less than 
20% were excluded from this analysis. Focal HER2 amplifications for tumours with 
MSK-IMPACT and FACETS data were inferred using the following criteria: fold 
change > 1.5 (MSK-IMPACT tumour:normal sequencing coverage ratio) and total 
HER2 copy number > 4 copies (FACETS-derived total copy number). To infer clonal- 
ity of each HER2 mutation, cancer cell fractions were estimated with 95% confidence 
intervals by integrating FACETS-derived joint segmentation and MSK-IMPACT 
mutation data as input into the ABSOLUTE algorithm“ (version 1.0.6). Mutations 
were classified as either clonal or subclonal based on the following criteria: clonal 
if the estimated cancer cell fractions > 0.85, otherwise subclonal. For patients with 
HER2 amplification, the mutation copy number (mutation multiplicity) was calcu- 
lated as previously described* to infer amplification of the mutant allele when the 
mutation multiplicity was greater than half of the total HER2 copy number. 
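
The thresholds in this paragraph can be restated as a minimal sketch. The `Her2Call` record and its field names are hypothetical conveniences introduced here; only the numeric cut-offs (purity of at least 20%, fold change > 1.5, total copy number > 4, cancer cell fraction > 0.85, multiplicity above half the total copy number) come from the text, and the FACETS and ABSOLUTE estimates are assumed to have been computed upstream.

```python
from dataclasses import dataclass


@dataclass
class Her2Call:
    """Hypothetical per-sample summary; field names are illustrative only."""
    purity: float        # FACETS-estimated tumour purity (0-1)
    fold_change: float   # MSK-IMPACT tumour:normal coverage ratio
    total_copies: float  # FACETS-derived total HER2 copy number
    ccf: float           # estimated cancer cell fraction of the mutation
    multiplicity: float  # mutation copy number


def passes_purity_filter(call: Her2Call) -> bool:
    # Samples with purity below 20% were excluded from the analysis.
    return call.purity >= 0.20


def is_focally_amplified(call: Her2Call) -> bool:
    # Both criteria must hold: coverage ratio and total copy number.
    return call.fold_change > 1.5 and call.total_copies > 4


def clonality(call: Her2Call) -> str:
    # Clonal if the cancer cell fraction exceeds 0.85, otherwise subclonal.
    return "clonal" if call.ccf > 0.85 else "subclonal"


def mutant_allele_amplified(call: Her2Call) -> bool:
    # Only assessed in amplified tumours: the mutant allele is inferred to be
    # amplified when it accounts for more than half of the HER2 copies.
    return is_focally_amplified(call) and call.multiplicity > call.total_copies / 2


example = Her2Call(purity=0.6, fold_change=2.3, total_copies=8, ccf=0.92, multiplicity=5)
print(is_focally_amplified(example), clonality(example), mutant_allele_amplified(example))
```

The sketch deliberately ignores the 95% confidence intervals on the cancer cell fraction, which the study also estimated.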

TMB and MSI. TMB, defined as the number of non-synonymous mutations 
per megabase, was calculated for patients with MSK-IMPACT sequencing data 
(n= 106)°. MSI was assessed for patients with HER2-mutant tumours with 
matched germline DNA sequencing data (n = 89) using an orthogonal bioinfor- 
matics tool, MSIsensor*°. Furthermore, mutations were decomposed into the 30 
constituent mutational signatures as described previously*”. In brief, MSIsensor 
scores <10 were classified as microsatellite stable and >10 were considered MSI- 
high using a previously validated cut-off score**. Those with a MSIsensor score 
of <10 but having evidence of a dominant mismatch repair mutational signature 
were also considered MSI***”, 
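
As an illustration of the definitions above, the following sketch computes TMB and assigns an MSI label. The function names, the panel size passed to `tmb`, and the treatment of a score of exactly 10 (counted here as MSI-high) are assumptions; the text specifies only the cut-off of 10 and the mismatch-repair signature rule.

```python
def tmb(n_nonsynonymous: int, panel_size_mb: float) -> float:
    """Tumour mutation burden: non-synonymous mutations per megabase sequenced."""
    return n_nonsynonymous / panel_size_mb


def msi_status(msisensor_score: float, dominant_mmr_signature: bool = False,
               cutoff: float = 10.0) -> str:
    """Label a tumour from its MSIsensor score and mutational signature.

    Assumption: a score exactly equal to the cut-off is treated as MSI-high.
    """
    if msisensor_score >= cutoff:
        return "MSI-high"
    if dominant_mmr_signature:
        # Below the cut-off, but a dominant mismatch-repair signature is present.
        return "MSI"
    return "MSS"


# Hypothetical inputs: 12 mutations over an assumed 1.2 Mb of sequenced territory.
print(tmb(12, 1.2), msi_status(3.0), msi_status(25.0), msi_status(3.0, True))
```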

Statistical analysis. For each HER2-mutant tumour type and the HER3-mutant 
cohort, a Simon optimal two-stage design was used, in which a true ORR8 < 10% was 
considered unacceptable (null hypothesis), whereas a true ORR8 > 30% (alternative hypothesis) 
merited further study. Efficacy in each cohort was analysed independently and the 
study was not designed to compare efficacy across cohorts formally. All patients 
who received at least one dose of neratinib were included in the safety and efficacy 
cohorts. All data reflect an interim data-cut taken on 10 March 2017 from patients 
enrolled up to 16 December 2016 (Extended Data Fig. 6). Most patients were off 
therapy at the time of data analysis (Extended Data Table 4). Progression-free 
survival was estimated using the Kaplan-Meier method. The study is registered at 
http://www.clinicaltrials.gov, under the identifier NCT01953926. Individual associ- 
ations among genomic changes and response were assessed by either Fisher’s exact 
or chi-squared tests (where appropriate) and corrected for multiple hypothesis 
testing using Benjamini-Hochberg correction. 
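
The operating characteristics of such a two-stage design can be checked with a short calculation. The sketch below assumes the stage sizes shown in Extended Data Fig. 1 (7 patients in stage 1, 18 in total) and assumes that the threshold of at least 4 responses counts responses across both stages; the function names are illustrative.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_sf(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(max(k, 0), n + 1))


def simon_two_stage(p: float, n1: int = 7, r1: int = 0, n: int = 18, r: int = 3):
    """Return (P(stop after stage 1), P(declare activity)) at true response rate p.

    Stop early if responses in stage 1 are <= r1; declare activity if the
    total number of responses over all n patients exceeds r.
    """
    pet = sum(binom_pmf(x, n1, p) for x in range(r1 + 1))
    n2 = n - n1
    declare = sum(
        binom_pmf(x1, n1, p) * binom_sf(r + 1 - x1, n2, p)
        for x1 in range(r1 + 1, n1 + 1)
    )
    return pet, declare


pet_null, alpha = simon_two_stage(0.10)  # under the null response rate
_, power = simon_two_stage(0.30)         # under the alternative response rate
print(f"early stop under null: {pet_null:.3f}, type I error: {alpha:.3f}, power: {power:.3f}")
```

Under these assumptions the design stops early about half of the time when the true response rate is 10%, which is the property that makes the optimal two-stage design attractive for small basket cohorts.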

Chi-squared or Fisher's exact tests were performed to compare gene-level and 
pathway-level associations between the dichotomous clinical benefit groups. P values 
were corrected for multiple hypothesis testing using Benjamini-Hochberg correction. 
HER2 and HER3 lollipop distribution plots were generated using ProteinPaint”. All 
other figures were generated using R software (http://www.R-project.org/). 
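
For readers unfamiliar with the two procedures named above, a minimal pure-Python sketch follows. It is not the R code used in the study; the function names are illustrative, and the P values in the usage example other than 0.018 and 0.064 are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return min(1.0, sum(prob(x) for x in range(lo, hi + 1)
                        if prob(x) <= p_obs * (1 + 1e-9)))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Two nominal P values from the figure legend plus three hypothetical ones.
nominal = [0.018, 0.064, 0.21, 0.47, 0.80]
print(benjamini_hochberg(nominal))
print(fisher_exact_two_sided(3, 1, 1, 3))
```

With five tests, a nominal P value of 0.018 is adjusted to 0.09, which illustrates how a nominally significant association can lose significance after correction.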

This clinical trial was not randomized and investigators were not blinded to 
treatment allocation and outcome assessment. 

Data availability. All datasets generated during and/or analysed during the current 
study, including patient-level clinical data as well as all sequencing data, have been 
deposited and are publicly available in the cBioPortal for Cancer Genomics under 
the accession code ‘SUMMIT, Nature, 2018’ 
(http://www.cbioportal.org/study?id=summit_2018). 
Study schema 
Eligibility: HER2 or HER3 mutations (documented by local testing) 
Cohorts: HER2-mutant tumours (endometrial; gastroesophageal; ovarian; colorectal; 
bladder/urinary tract; solid tumours (NOS)) and HER3-mutant tumours (solid tumours (NOS)) 
Treatment: neratinib 240 mg daily 

Primary endpoint 
* Objective response rate at week 8 (ORR8) 

Secondary endpoints 
* ORR (confirmed) 
* Clinical benefit rate (CBR) 
* Progression-free survival (PFS) 
* Safety 
* Biomarkers 

Simon 2-stage design 
* If ≥1 response in first evaluable 7 patients, expand cohort to Stage 2 (N=18) 
* If ≥4 responses in Stage 2, expand or breakout 

Tumor assessments 
* RECIST v1.1 (primary criteria) 
* PET response criteria (RECIST non-evaluable) 

Statistical methods 
* ORR8, ORR, CBR: associated 95% CI 
* Median PFS: Kaplan-Meier estimate with 95% CI 

Extended Data Figure 1 | Design of SUMMIT study. Five tumour-specific HER2 
(ERBB2)-mutant cohorts were pre-specified (endometrial, gastroesophageal, ovarian, 
colorectal and bladder/urinary tract). In addition, a sixth ‘solid tumour (not 
otherwise specified, NOS)’ HER2-mutant cohort allowed for the enrolment of patients 
with any other cancer types. A sufficient number of patients with breast, cervical, 
biliary and lung cancer were enrolled in the solid tumours (NOS) cohort to permit 
independent efficacy analysis using the same design as the pre-specified cohorts. 
Patients with HER3 (ERBB3)-mutant tumours were enrolled in a HER3-specific cohort 
regardless of tumour type. CBR, clinical benefit rate; cfDNA, cell-free (tumour) 
DNA; CI, confidence interval. 
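
The 95% confidence intervals reported alongside the response rates are numerically consistent with exact (Clopper-Pearson) binomial intervals; for example, 0 responses in 16 patients gives an upper limit of 20.6%. The text does not name the interval method, so the sketch below is an assumption about how such intervals could be computed, using bisection on the binomial distribution.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target: float, iters: int = 100) -> float:
    """Find p in [0, 1] with f(p) = target, for f monotonically decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) confidence interval for x responses in n patients."""
    lower = 0.0 if x == 0 else _solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if x == n else _solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


for responders, cohort_size in [(0, 16), (0, 12), (1, 26)]:
    lo, hi = clopper_pearson(responders, cohort_size)
    print(f"{responders}/{cohort_size}: [{100 * lo:.1f}-{100 * hi:.1f}]")
```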
[Figure: lollipop plots of HER2 (a) and HER3 (b) mutations along the protein 
(amino acid positions 100-1200). Inset labels: exon 20 insertions, G776delinsVC, 
V777_G778insG. Domains: receptor L domain; furin-like domain; kinase domain; growth 
factor receptor domain IV; transmembrane domain. Legend: 125 qualifying ERBB2 
alterations (109 hotspot): 93 missense (1 germline), 30 in-frame insertion, 
1 frameshift, 1 structural variant; 16 qualifying ERBB3 alterations: 16 missense 
(12 hotspot).] 

Extended Data Figure 2 | Distribution of HER2 and HER3 mutations positioned by 
their amino acid coordinates across the respective protein domains. a, b, HER2 (a) 
and HER3 (b) mutations (125 and 16 mutations, respectively). Each unique mutation is 
represented by a circle, with the circle size and number representing the frequency, 
and coloured to show the mutation class as indicated in the legend. The corresponding 
amino acid change and common hotspot mutations (shown in pink) are labelled next to 
the circles. 
[Figure: reflected lollipop plots for ERBB2 (a) and ERBB3 (b). Domains: receptor L 
domain; furin-like domain; kinase domain; growth factor receptor domain IV; 
transmembrane domain. 
ERBB2, neratinib cohort: 93 missense (1 germline), 30 in-frame insertion, 
1 frameshift, 1 structural variant; public datasets: 574 missense, 60 in-frame indel. 
ERBB3, neratinib cohort: 16 missense; public datasets: 600 missense, 9 in-frame indel.] 

Extended Data Figure 3 | Spectrum of HER2 and HER3 mutations observed in the 
neratinib study versus TCGA, ICGC and other public datasets. a, b, Distribution of 
HER2 (a) and HER3 (b) mutations observed across our cohort in comparison to the 
spectrum of HER2 and HER3 mutations (reflected lollipop) from publicly available 
datasets (TCGA, 

[Figure: a, waterfall plot of best percentage change from baseline (y axis, -100 to 
+50) and PFS bars for patients with exon 20 insertions. Legend: tumor type (breast, 
lung, bladder, ovarian, other); response criteria (RECIST 1.1, PET response criteria, 
non-evaluable); * no target lesion measurement; patients with 0% as best change; best 
response (complete response, partial response, stable disease, progressive disease, 
non-evaluable); treatment (ongoing, off). b, ERBB2 exon 20, residues 770-780.] 

c  Exon 20 insertions (reference sequence, residues 770-780: EAYVMAGVGSP) 

Insertion(s)                                           N     Resulting sequence 
E770_A771insAYVM, A771_Y772insYVMA, Y772_A775dup       17    EAYVMAYVMAGVGSP 
G776delinsVC                                           1     EAYVMAGVCVGSP 
G776_V777insVGC                                        3     EAYVMAGVGCVGSP 
V777_G778insG                                          1     EAYVMAGVGGSP 
V777_G778insGSP, G778_P780dup, G780_Y781insGSP         6     EAYVMAGVGSPGSP 

Extended Data Figure 4 | Distribution and outcome of 28 HER2 exon 20 insertions. 
a, Percentage best change and PFS plots corresponding to each type of exon 20 
insertion (colour coded by synonymous amino acid change). Three cases with no change 
are indicated in colour-coded circles above the x axis. b, Zoomed-in schematic of all 
exon 20 insertions positioned by their amino acid coordinates and frequencies. 
c, Five unique types of exon 20 insertions observed in the study with the resulting 
full amino acid sequences (insertion indicated in red). 

© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 
[Figure legend fragment: No Benefit (< 24 weeks)] 

Extended Data Figure 5 | Genomic modifiers of response and outcome by treatment 
duration. a, Cancer cell fractions with 95% confidence intervals and clonality status 
of all HER2 mutations in 74 patients with sufficient sequencing data ordered by 
increasing clinical benefit (weeks on therapy). b, Comparison of the percentage 
activation of known oncogenic alterations in the three pathways between the patients 
of clinical benefit (n = 20, biologically independent samples) and no benefit (n = 66, 
biologically independent samples). Nominal Fisher’s P values are shown. 
Assessed for eligibility (n=175) 
    Excluded (n=32) 
    • Not meeting eligibility criteria (n=31) 
    • Unknown (n=1) 

Enrolled (n=143) 
Received neratinib (n=141) 

Analyzed (n=141) 

Discontinued neratinib (n=131) 
    Disease progression (n=104) 
    Clinical deterioration (n=8) 
    Adverse event (n=5) 
    Investigator request (n=5) 
    Withdrawal of consent (n=5) 
    Death (n=3) 
    Lost to follow-up (n=1) 

Extended Data Figure 6 | SUMMIT CONSORT diagram. 
Extended Data Table 1 | Patient demographics and efficacy by cohort 

Age and sex, n (%) 

Cohort                         Median age (range), years   <65 years    ≥65 years    Female       Male 
HER2 Breast (n=25)             57.0 (37-80)                19 (76.0)    6 (24.0)     24 (96.0)    1 (4.0) 
HER2 Lung (n=26)               62.0 (46-74)                18 (69.2)    8 (30.8)     17 (65.4)    9 (34.6) 
HER2 Bladder (n=16)            65.0 (48-83)                8 (50.0)     8 (50.0)     3 (18.8)     13 (81.3) 
HER2 Colorectal (n=12)         65.0 (30-81)                6 (50.0)     6 (50.0)     6 (50.0)     6 (50.0) 
HER2 Biliary tract (n=9)       66.0 (57-78)                2 (22.2)     7 (77.8)     5 (55.6)     4 (44.4) 
HER2 Cervical (n=5)            49.0 (42-56)                5 (100)      0 (0)        5 (100)      0 (0) 
HER2 Endometrial (n=7)         57.0 (54-74)                5 (71.4)     2 (28.6)     7 (100)      0 (0) 
HER2 Gastroesophageal (n=5)    67.0 (36-70)                1 (20.0)     4 (80.0)     2 (40.0)     3 (60.0) 
HER2 Ovarian (n=4)             56.5 (38-58)                4 (100)      0 (0)        4 (100)      0 (0) 
HER2 NOS (n=16)                59.0 (32-80)                13 (81.3)    3 (18.8)     7 (43.8)     9 (56.3) 
HER3 NOS (n=16)                66.0 (39-82)                7 (43.8)     9 (56.3)     12 (75.0)    4 (25.0) 

ECOG PS and prior systemic lines, n (%); median time from metastasis to enrolment, years (range) 

Cohort                         ECOG 0       ECOG 1       ECOG 2       None       1 line       2 lines      ≥3 lines     Time to enrolment 
HER2 Breast (n=25)             7 (28.0)     17 (68.0)    1 (4.0)      0 (0)      3 (12.0)     2 (8.0)      20 (80.0)    2.64 (0.1-1.0) 
HER2 Lung (n=26)               11 (42.3)    14 (53.8)    1 (3.8)      1 (3.8)    12 (46.2)    6 (23.1)     7 (26.9)     0.83 (0.1-3.1) 
HER2 Bladder (n=16)            6 (37.5)     10 (62.5)    0 (0)        1 (6.3)    2 (12.5)     9 (56.3)     4 (25.0)     0.69 (0.2-2.3) 
HER2 Colorectal (n=12)         5 (41.7)     7 (58.3)     0 (0)        0 (0)      4 (33.3)     3 (25.0)     5 (41.7)     1.14 (0.0-2.7) 
HER2 Biliary tract (n=9)       2 (22.2)     6 (66.7)     1 (11.1)     1 (11.1)   3 (33.3)     2 (22.2)     3 (33.3)     1.00 (0.0-2.8) 
HER2 Cervical (n=5)            1 (20.0)     4 (80.0)     0 (0)        n/r        n/r          n/r          n/r          1.40 (0.3-4.5) 
HER2 Endometrial (n=7)         2 (28.6)     5 (71.4)     0 (0)        n/r        n/r          n/r          n/r          n/r 
HER2 Gastroesophageal (n=5)    0 (0)        5 (100)      0 (0)        0 (0)      2 (40.0)     1 (20.0)     2 (40.0)     0.80 (0.4-4.3) 
HER2 Ovarian (n=4)             n/r          n/r          n/r          n/r        n/r          n/r          n/r          n/r 
HER2 NOS (n=16)                3 (18.8)     11 (68.8)    2 (12.5)     1 (6.3)    6 (37.5)     2 (12.5)     7 (43.8)     1.35 (0.0-5.4) 
HER3 NOS (n=16)                1 (6.3)      12 (75.0)    3 (18.8)     0 (0)      1 (6.3)      11 (68.8)    4 (25.0)     1.13 (0.3-4.5) 

Outcome 

Cohort                         ORR at week 8, n (%) [95% CI]   ORR, n (%) [95% CI]       Clinical benefit rate, n (%) [95% CI]   Median PFS, months 
HER2 Breast (n=25)             8 (32.0) [14.9-53.5]            6 (24.0) [9.4-45.1]       10 (40.0) [21.1-61.3]                   3.5 
HER2 Lung (n=26)               1 (3.8) [0.1-19.6]              1 (3.8) [0.1-19.6]        11 (42.3) [23.4-63.1]                   5.5 
HER2 Bladder (n=16)            0 (0.0) [0.0-20.6]              0 (0.0) [0.0-20.6]        3 (18.8) [4.0-45.6]                     1.8 
HER2 Colorectal (n=12)         0 (0.0) [0.0-26.5]              0 (0.0) [0.0-26.5]        1 (8.3) [0.2-38.5]                      1.8 
HER2 Biliary tract (n=9)       2 (22.2) [2.8-60.0]             0 (0.0) [0.0-33.6]        3 (33.3) [7.5-70.1]                     2.8 
HER2 Cervical (n=5)            1 (20.0) [0.5-71.6]             1 (20.0) [0.5-71.6]       3 (60.0) [14.7-94.7]                    20.1 
HER2 Endometrial (n=7)         0 (0.0) [0.0-41.0]              0 (0.0) [0.0-41.0]        2 (28.6) [3.7-71.0]                     2.6 
HER2 Gastroesophageal (n=5)    0 (0.0) [0.0-52.2]              0 (0.0) [0.0-52.2]        1 (20.0) [0.5-71.6]                     n/r 
HER2 Ovarian (n=4)             0 (0.0) [0.0-60.2]              0 (0.0) [0.0-60.2]        0 (0.0) [0.0-60.2]                      2.1 
HER2 NOS (n=16)                1 (6.3) [0.2-30.2]              0 (0.0) [0.0-20.6]        3 (18.8) [4.0-45.6]                     1.9 
HER3 NOS (n=16)                0 (0.0) [0.0-20.6]              0 (0.0) [0.0-20.6]        2 (12.5) [1.6-38.3]                     1.7 

n/r, value not recoverable. 


Extended Data Table 2 | Treatment-emergent adverse events (occurring in > 10% of patients) 

Neratinib monotherapy (N=141) 

Adverse event, n (%)                        Any grade      Grade ≥3 
Diarrhoea                                   104 (73.8)     31 (22.0) 
Nausea                                      61 (43.3)      3 (2.1) 
Vomiting                                    58 (41.1)      3 (2.1) 
Constipation                                49 (34.8)      2 (1.4) 
Fatigue                                     45 (31.9)      5 (3.5) 
Decreased appetite                          40 (28.4)      1 (0.7) 
Abdominal pain                              33 (23.4)      7 (5.0) 
Anaemia                                     22 (15.6)      10 (7.1) 
Dyspnoea                                    18 (12.8)      5 (3.5) 
Dehydration                                 17 (12.1)      8 (5.7) 
Aspartate aminotransferase increased        15 (10.6)      5 (3.5) 
Asthenia                                    15 (10.6)      1 (0.7) 
Weight decreased                            15 (10.6)      0 

Characteristics of diarrhoea 
Action taken with neratinib, n (%) 
    Permanent discontinuation                                            4 (2.8) 
Serious† diarrhoea, n (%)                                                15 (10.6) 
Median (range) number of grade 3 diarrhoea episodes per patient          4 (1-12) 
Median (range) duration of grade 3 diarrhoea episode, days               2 (1-8) 
Median (range) time to first grade 3 diarrhoea episode, days             10 (4-87) 

*All events of grade 3. 
†Serious adverse event as defined per study protocol. 
Extended Data Table 3 | PET response criteria 

Response category: based on sum of SUVmax from 1 to 5 target lesions, each target 
lesion with an initial SUVmax of >1.5 × normal liver background SUVmax. 

Complete metabolic response (CMR) 
    • Reduction of SUVmax of all target lesions to less than normal liver background 
      SUVmax (for non-brain lesions) or less than normal brain background SUVmax 
      (for brain lesions) 
    AND 
    • The reduction of all other FDG-avid lesions consistent with disease to less 
      than normal liver background SUVmax 

Partial metabolic response (PMR) 
    • Sum of SUVmax of all target lesions is decreased by ≥30% compared to baseline 
      sum of SUVmax of all target lesions 
    AND 
    • No new lesions 

Stable metabolic disease (SMD) 
    Not satisfying the criteria for CMR, PMR, PMD, or NE 

Progressive metabolic disease (PMD) 
    • Sum of SUVmax of all target lesions is increased by ≥30% 
    OR 
    • Appearance of one or more unequivocal new FDG-avid lesions 

Not evaluable (NE) 
    • Missing FDG-PET series or incomplete anatomy at follow-up timepoint 
    • A PET/CT scanner change from baseline 
    • Variation in FDG uptake time ≥15 minutes compared to baseline 
    • Change in reconstruction algorithm 

CT, computed tomography; FDG-PET, 18F-fluorodeoxyglucose positron-emission tomography; SUVmax, maximum standardized uptake value. 
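
The decision logic of the PET response criteria can be sketched as a small function. This is a simplified illustration only: the function name and arguments are hypothetical, the order in which the categories are tested is an assumption, and the handling of brain lesions and non-target FDG-avid lesions is omitted.

```python
def pet_response(baseline_suvmax, followup_suvmax, liver_background,
                 new_lesions=False, evaluable=True):
    """Classify metabolic response from target-lesion SUVmax values.

    baseline_suvmax / followup_suvmax: SUVmax of the same 1-5 target lesions.
    liver_background: normal liver background SUVmax at follow-up.
    """
    if not evaluable:
        return "NE"   # e.g. missing series, scanner or reconstruction change
    if new_lesions:
        return "PMD"  # unequivocal new FDG-avid lesion
    base_sum = sum(baseline_suvmax)
    change = (sum(followup_suvmax) - base_sum) / base_sum
    if change >= 0.30:
        return "PMD"
    if all(s < liver_background for s in followup_suvmax):
        return "CMR"
    if change <= -0.30:
        return "PMR"
    return "SMD"


# Hypothetical lesions: the sum falls from 18 to 11 (a decrease of about 39%).
print(pet_response([10.0, 8.0], [6.0, 5.0], liver_background=2.0))
```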
Extended Data Table 4 | Patient disposition by cohort 

Treatment status, n (%) 

Cohort                         Continuing    Discontinued   Death      Disease progression   Clinical deterioration   Adverse event   Investigator request   Withdrawal of consent   Lost to follow-up 
HER2 Breast (n=25)             1 (4)         24 (96.0)      0 (0)      n/r                   n/r                      n/r             n/r                    n/r                     n/r 
HER2 Bladder (n=16)            1 (6.2)       15 (93.8)      n/r        n/r                   n/r                      n/r             n/r                    n/r                     n/r 
HER2 Lung (n=26)               1 (3.8)       25 (96.2)      0 (0)      n/r                   n/r                      n/r             n/r                    n/r                     n/r 
HER2 Colorectal (n=12)         0 (0)         12 (100)       0 (0)      n/r                   n/r                      n/r             n/r                    n/r                     n/r 
HER2 Biliary tract (n=9)       1 (11.1)      8 (88.9)       1 (11.1)   5 (55.6)              1 (11.1)                 1 (11.1)        0 (0)                  0 (0)                   0 (0) 
HER2 Cervical (n=5)            2 (40)        3 (60.0)       n/r        n/r                   n/r                      n/r             n/r                    n/r                     n/r 
HER2 Endometrial (n=7)         1 (14.3)      6 (85.7)       n/r        n/r                   n/r                      n/r             n/r                    n/r                     n/r 
HER2 Gastroesophageal (n=5)    0 (0)         5 (100)        0 (0)      4 (80.0)              1 (20.0)                 0 (0)           0 (0)                  0 (0)                   0 (0) 
HER2 Ovarian (n=4)             1 (25)        3 (75.0)       0 (0)      2 (50.0)              1 (25.0)                 0 (0)           0 (0)                  0 (0)                   0 (0) 
HER2 NOS (n=16)                2 (12.5)      14 (87.5)      0 (0)      9 (56.3)              2 (12.5)                 1 (6.3)         2 (12.5)               0 (0)                   0 (0) 
HER3 NOS (n=16)                0 (0)         16 (100)       1 (6.3)    15 (93.8)             0 (0)                    0 (0)           0 (0)                  0 (0)                   0 (0) 

Subjects ended study, n (%) 

Cohort                         Ended study   Death        Withdrawal of consent   Lost to follow-up   Other 
HER2 Breast (n=25)             15 (60.0)     12 (48.0)    2 (8.0)                 1 (4.0)             0 (0) 
HER2 Bladder (n=16)            14 (87.5)     13 (81.3)    0 (0)                   1 (6.3)             0 (0) 
HER2 Lung (n=26)               16 (61.5)     13 (50.0)    2 (7.7)                 1 (3.8)             0 (0) 
HER2 Colorectal (n=12)         9 (75.0)      8 (66.7)     0 (0)                   1 (8.3)             0 (0) 
HER2 Biliary tract (n=9)       6 (66.7)      6 (66.7)     0 (0)                   0 (0)               0 (0) 
HER2 Cervical (n=5)            1 (20.0)      1 (20.0)     0 (0)                   0 (0)               0 (0) 
HER2 Endometrial (n=7)         6 (85.7)      5 (71.4)     1 (14.3)                0 (0)               0 (0) 
HER2 Gastroesophageal (n=5)    5 (100)       3 (60.0)     0 (0)                   1 (20.0)            1 (20.0) 
HER2 Ovarian (n=4)             3 (75.0)      3 (75.0)     0 (0)                   0 (0)               0 (0) 
HER2 NOS (n=16)                7 (43.8)      7 (43.8)     0 (0)                   0 (0)               0 (0) 
HER3 NOS (n=16)                14 (87.5)     11 (68.8)    2 (12.5)                1 (6.3)             0 (0) 

NOS, not otherwise specified; n/r, value not recoverable. 
Dynamic basis for dG•dT misincorporation 
via tautomerization and ionization 


Isaac J. Kimsey, Eric S. Szymanski, Walter J. Zahurancik, Anisha Shakya, Yi Xue, Chia-Chieh Chu, 
Bharathwaj Sathyamoorthy, Zucai Suo & Hashim M. Al-Hashimi 


Tautomeric and anionic Watson-Crick-like mismatches have important roles in replication and translation errors through 
mechanisms that are not fully understood. Here, using NMR relaxation dispersion, we resolve a sequence-dependent 
kinetic network connecting G•T/U wobbles with three distinct Watson-Crick mismatches: two rapidly exchanging 
tautomeric species (G^enol•T/U ⇌ G•T^enol/U^enol; population less than 0.4%) and one anionic species (G•T⁻/U⁻; population 
around 0.001% at neutral pH). The sequence-dependent tautomerization or ionization step was inserted into a minimal 
kinetic mechanism for correct incorporation during replication after the initial binding of the nucleotide, leading to 
accurate predictions of the probability of dG•dT misincorporation across different polymerases and pH conditions and 
for a chemically modified nucleotide, and providing mechanisms for sequence-dependent misincorporation. Our results 
indicate that the energetic penalty for tautomerization and/or ionization accounts for an approximately 10⁻² to 10⁻³-fold 
discrimination against misincorporation, which proceeds primarily via tautomeric dG^enol•dT and dG•dT^enol, with 
contributions from anionic dG•dT⁻ dominant at pH 8.4 and above or for some mutagenic nucleotides. 


In their paper describing the structure of the DNA double helix’, 
Watson and Crick proposed that if nucleotide bases adopted their ener- 
getically unfavourable tautomeric forms, mismatches (Fig. 1a) could 
pair up in a Watson—Crick (WC)-like geometry (Fig. 1b) and potentially 
give rise to spontaneous mutations. Decades later, it is well established 
that the replicative and translational machineries have a tight control 
over the WC geometry to discriminate against mismatches” °. There 
is also evidence that both tautomeric’ ° (Fig. 1b) and anionic” *!415 
(Fig. 1c) WC-like mismatches can evade such fidelity checkpoints and 
give rise to errors in replication®”’ and translation’®. Although they are 
central to the fidelity of information transfer in molecular biology, and 
despite growing evidence that spontaneous mutations are involved in 
cancer-causing alterations!’, the existence of WC-like mismatches and 
their contribution to replication and translation errors have not yet 
been definitively established. 

Tautomeric and anionic mismatches exist in a variety of chemical 
forms (Extended Data Fig. 1). For example, WC-like G•T/U mismatches 
can form when either the guanine (G^enol•T/U and G⁻•T/U) or 
the thymidine or uridine (G•T^enol/U^enol and G•T⁻/U⁻) base assumes a 
rare enolic (Fig. 1b) or anionic (Fig. 1c) form. Although it remains 
unclear which WC-like mismatch contributes to replication and trans- 
lation errors, factors that stabilize different forms (for example, changes 
in pH”*!*!8 and chemical modifications'®) have been shown to 
increase the probability of misincorporation'*”°. Misincorporation 
probabilities can also vary markedly with sequence context, through 
mechanisms that are still poorly understood?!. The resolution of 
these different WC-like mismatches and their chemical dynamics is 
key to the elucidation of their potential roles in replication, transcrip- 
tion and translation errors. However, this presents a formidable 
challenge for current biophysical methods because these mismatches 
differ only in the placement of a single proton and a π-bond (Fig. 1b, c 
and Extended Data Fig. 1). Protons are generally invisible to X-ray 


crystallography and cryo-EM”, and consequently it has not been 
possible to unambiguously resolve the identity of WC-like mismatches 
captured in the active sites of polymerases®”!>” and the ribosome 
decoding site?**, Moreover, WC-like mismatches are predicted to exist in 
rapid tautomeric equilibria (G^enol·T/U ⇌ G·T^enol/U^enol)25,26 (Fig. 1b, c
and Extended Data Fig. 1), making them exceptionally difficult to 
capture experimentally. 

Techniques based on NMR relaxation dispersion””~*° enable the 
characterization of low-abundance, short-lived conformational states, 
known as ‘excited states’ (ESs), in biomolecules*°. Recently we used 
these techniques to provide evidence that wobble G·T/U mismatches
exist in dynamic equilibrium with tautomeric (ES1) and anionic (ES2)
WC-like mismatches in DNA and RNA duplexes8,31. The chemical
shifts measured for guanine N1 (G-N1) and thymidine/uridine N3
(T/U-N3) in tautomeric ES1 were consistent with G^enol·T/U, but were
partially skewed towards G·T^enol/U^enol. This was interpreted as evidence
for a rapid (on the chemical-shift timescale) equilibrium between a
major G^enol·T/U and a minor G·T^enol/U^enol species8. The anionic ES2
was detectable only at high pH (7.8 or above), and was heavily skewed
in favour of G·T^-/U^- with no evidence of G^-·T/U. The roles of these
various WC-like mismatches in replication and translation errors
remain unknown. Here, by combining NMR relaxation dispersion
and measurements of misincorporation rates, we resolved a kinetic
network connecting two distinct tautomeric and one anionic WC-like
mismatch, and established their relative contributions to dG·dTTP
misincorporation.


Tilting the tautomeric equilibrium 

If ES1 does represent two tautomeric species in rapid equilibrium (Fig. 1b),
it should be feasible to tilt the equilibrium (K_t = p_Genol / p_Tenol/Uenol) by
changing the local sequence or structural context around the mismatch,
or by modification of the bases (Fig. 1d). This in turn should lead to

Figure 1 | Tilting the rapid tautomeric equilibria in excited-state WC-like
mismatches. a-c, Chemical structures of ground state wobble (a) and
excited-state tautomeric (b) and anionic (c) WC-like G·T/U mismatches.
X = H and CH3 in uridine and thymidine, respectively. d, Tilting the
rapid tautomeric equilibria using sequence (cyan), structure (red) and


very specific changes in the chemical shifts (ω) of ES1 G-N1 and
T/U-N3, which are provided as population-weighted average values
denoted as ⟨ω⟩ over the two species (Fig. 1e, left). Tilting the equilibrium
in favour of G^enol·T/U should induce a downfield shift of ω for
ES1 G-N1, due to an increase in the population of deprotonated G^enol,
and an upfield shift of ω for ES1 T/U-N3, due to a decrease in the
population of deprotonated T^enol/U^enol, and vice versa (Fig. 1e, left). A
plot of ⟨Δω_G-N1⟩ against ⟨Δω_T/U-N3⟩ is predicted to be linear (Fig. 1e,
right), with a negative slope and intercept determined by the fundamental
chemical shifts of the tautomeric species (see equation (1) in
Supplementary Methods).
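
The predicted linear trend follows directly from population-weighted averaging, and a small numerical sketch makes the sign of the slope explicit. The limiting shift differences and K_t values below are hypothetical placeholders chosen only for illustration (they are not fitted values from this work), and the function names are likewise illustrative.

```python
def g_enol_fraction(k_t):
    """Fraction of ES1 present as G^enol·T/U, given K_t = p_Genol / p_Tenol/Uenol."""
    return k_t / (1.0 + k_t)


def averaged_shifts(k_t, dw_g_hi, dw_g_lo, dw_t_hi, dw_t_lo):
    """Population-weighted ES1 shift differences (p.p.m.) under fast exchange.

    dw_g_hi / dw_g_lo: hypothetical G-N1 shift difference when G is / is not
    deprotonated; dw_t_hi / dw_t_lo: the same for T/U-N3.
    """
    f = g_enol_fraction(k_t)
    dw_g = f * dw_g_hi + (1.0 - f) * dw_g_lo  # rises as G^enol is populated
    dw_t = f * dw_t_lo + (1.0 - f) * dw_t_hi  # falls as T^enol/U^enol is depleted
    return dw_g, dw_t


def slope(points):
    """Slope of the line through the first and last (x, y) points."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    return (y1 - y0) / (x1 - x0)


# Hypothetical limiting values and illustrative K_t values.
LIMITS = dict(dw_g_hi=60.0, dw_g_lo=5.0, dw_t_hi=55.0, dw_t_lo=5.0)
pairs = [averaged_shifts(k, **LIMITS) for k in (0.5, 1.1, 2.1, 4.6)]
xy = [(dw_t, dw_g) for dw_g, dw_t in pairs]
print(slope(xy))  # negative: the two averaged shifts move in opposite directions
```

Because both averaged shifts are linear in the same fraction, eliminating that fraction gives a straight line whose slope and intercept depend only on the limiting shifts of the two tautomers.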

We measured 15N relaxation dispersion for 5 dG·dT mismatches
within distinct sequence contexts and for 13 rG·rU mismatches in 9 structurally
unique non-coding RNAs (Fig. 2a and Extended Data Fig. 2a).
Experiments were carried out at near-neutral pH (6.4-6.9) in order
to reduce levels of the anionic ES2 below detection limits8 (Extended
Data Fig. 2b). The relaxation dispersion experiments measure spin-relaxation
rates in the rotating frame (R1ρ) during a relaxation period
in which a radiofrequency field is applied with variable offset (Ω/2π,
in Hz) and power (ω/2π, in Hz), in order to suppress the chemical
exchange contribution (R_ex) to the transverse spin relaxation rate (R2)
arising from chemical exchange between the energetically more stable
ground state (GS) and the ES?”8. 
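
How the spin-lock field suppresses the exchange contribution can be illustrated with a commonly used closed-form expression for two-state exchange in the fast-exchange limit. This is only a sketch under that assumption: the analysis described here uses fits to the Bloch-McConnell equations, and all numerical values and names below are hypothetical.

```python
import math


def r1rho_fast_exchange(offset_hz, power_hz, r1, r2, p_es, dw_hz, kex):
    """R1rho (s^-1) for two-state exchange, fast-exchange approximation.

    offset_hz: spin-lock offset from the population-averaged resonance (Hz)
    power_hz:  spin-lock power (Hz), must be > 0
    p_es:      excited-state population; dw_hz: GS-ES shift difference (Hz)
    kex:       exchange rate k_GS->ES + k_ES->GS (s^-1)
    """
    omega = 2.0 * math.pi * offset_hz
    w1 = 2.0 * math.pi * power_hz
    dw = 2.0 * math.pi * dw_hz
    w_eff_sq = omega ** 2 + w1 ** 2           # squared effective field
    sin_sq = w1 ** 2 / w_eff_sq               # sin^2 of the tilt angle
    cos_sq = 1.0 - sin_sq
    r_ex = p_es * (1.0 - p_es) * dw ** 2 * kex / (kex ** 2 + w_eff_sq)
    return r1 * cos_sq + (r2 + r_ex) * sin_sq


# Hypothetical parameters: on-resonance profile at weak and strong spin-lock power.
for power in (100.0, 500.0, 2000.0):
    value = r1rho_fast_exchange(0.0, power, r1=2.0, r2=10.0,
                                p_es=0.001, dw_hz=500.0, kex=3000.0)
    print(power, round(value, 3))
```

In this approximation, raising the spin-lock power (or moving far off resonance) increases the effective field and shrinks the exchange term, which is the dependence that the dispersion profiles exploit.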

We observed G-N1 and T/U-N3 relaxation dispersion consistent with 
WC-like ES1 exchange for all five dG·dT and eight rG·rU mismatches
located within helical environments (Fig. 2b and Extended Data 
Fig. 3a), thus establishing their widespread occurrence in DNA and 
RNA. No relaxation dispersion was observed (Extended Data Fig. 3b) 
for rG·rU mismatches adjacent to apical loops, three-way junctions, or
bulges (Extended Data Fig. 2a). This could be due to the lower abun- 
dance of WC-like mismatches when outside of the helical environment, 
although we cannot rule out the possibility that the exchange is orders 
of magnitude faster and beyond detection. 

As predicted on the basis of variably tilting the G^enol·T/U ⇌ G·T^enol/U^enol
equilibrium (Fig. 1e, right), a plot of the fitted ⟨Δω_G-N1⟩ and
⟨Δω_T/U-N3⟩ values obtained from two-state analysis (GS ⇌ ES1) of the
relaxation dispersion profiles (Fig. 2b, Extended Data Fig. 3a and
Supplementary Table 1) formed a line with a negative slope (Fig. 2c).
As a negative control, the corresponding GS G-N1 and T/U-N3 

chemical (green) modifications. X-X′ and Y-Y′ denote WC base pairs
adjacent to the G·T/U mismatches. e, Perturbations that differentially tilt
the tautomeric equilibrium induce changes in the chemical shifts of G-N1
(left) and T/U-N3 (middle); a plot of ⟨Δω_G-N1⟩ against ⟨Δω_T/U-N3⟩ (right)
is linear with a negative slope.


chemical shifts were plotted and no correlation was observed (Extended 
Data Fig. 3c). We confirmed these linear trends using chemical modifications
that tilt the tautomeric equilibrium towards enolic dT (dG·dU
and dG·5BrdU) or enolic dG (8BrdG·dT) (Fig. 2c, Extended Data
Fig. 3a and Supplementary Discussion 1).


Sequence-dependent G^enol·T/U ⇌ G·T^enol/U^enol

A linear fit of the values obtained by plotting ⟨Δω_G-N1⟩ against
⟨Δω_T/U-N3⟩, assuming physically reasonable ranges, yielded fundamental
chemical shifts for the tautomeric species that are in excellent agreement
with values predicted by DFT calculations (Fig. 2c and Supplementary
Tables 2, 3). The tautomeric equilibria (Supplementary Table 2) obtained
from this analysis and from refitting the relaxation dispersion data using
a three-state model with linear topology (wobble ⇌ G·T^enol/U^enol ⇌
G^enol·T/U) are slightly tilted in favour of dG^enol·dT in DNA (K_t = 2.1-4.6),
whereas the populations of rG^enol·rU and rG·rU^enol are more comparable
in RNA (K_t = 0.5-1.1). These differences may be attributed to the
electron-donating methyl group in dT, which destabilizes dT^enol relative
to rU^enol (ref. 32). The relaxation dispersion data also enabled us
to estimate a lower bound for the fast tautomeric exchange rate of
k_t = k(G^enol→T^enol/U^enol) + k(T^enol/U^enol→G^enol) > ~500,000-1,000,000 s^-1, faster than
exchange processes previously measured by similar relaxation dispersion methods
(Fig. 2d and Extended Data Fig. 4), and a transition-state barrier for
the conversion of G·T^enol/U^enol → G^enol·T/U of <9-10 kcal mol^-1
(using k_B T h^-1 as the pre-exponential factor and κ = 1 as the transmission
coefficient), which is in good agreement with values reported
using computational methods (around 11.5 kcal mol^-1). These results
establish the existence of G·T^enol/U^enol and G^enol·T/U in an ultra-fast
equilibrium, each of which can potentially contribute to replication and
translation errors.
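
The conversion from a lower bound on the exchange rate to an upper bound on the barrier can be reproduced with the Eyring relation, using k_B T h^-1 as the pre-exponential factor and κ = 1 as stated above. The sketch below assumes a temperature of 25 °C and, as a simplification, treats the quoted rate bounds as the rate constant of the single step of interest; the function name is illustrative.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J s
R_KCAL = 1.987204e-3   # gas constant, kcal/(mol K)


def eyring_barrier(rate_s, temp_k=298.15, kappa=1.0):
    """Free-energy barrier (kcal/mol) from k = kappa*(k_B*T/h)*exp(-dG/(R*T))."""
    prefactor = kappa * K_B * temp_k / H
    return R_KCAL * temp_k * math.log(prefactor / rate_s)


for k in (5.0e5, 1.0e6):
    print(f"k = {k:.1e} s^-1 -> barrier = {eyring_barrier(k):.2f} kcal/mol")
```

Under these assumptions the two rate bounds correspond to barriers of roughly 9.7 and 9.3 kcal mol^-1; because a faster rate implies a lower barrier, a lower bound on the rate translates into an upper bound on the barrier.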

Notably, the exchange parameters vary markedly with sequence context
(Supplementary Table 1). The ES1 population (p_ES1 = p_Genol + p_Tenol/Uenol)
varies 3-fold in DNA and 8-fold in RNA, whereas the forward (k_GS→ES1)
and reverse (k_ES1→GS) rate constants vary by 4- and 5-fold, respectively,
for DNA, and by 38- and 6-fold, respectively, for RNA (Supplementary
Table 1). A linear correlation is observed between p_ES1 and K_t (Fig. 2e and
Supplementary Table 2), indicating that the G^enol·T/U population dominates
these variations with sequence and structural context. In DNA,
these variations can potentially be explained by the sequence-specific

Figure 2 | Resolving rapidly interconverting tautomers. a, Representative
hairpin (hp) DNA, RNA and chemically modified constructs. The name
denotes the mismatch and sequence context (5′-3′). b, Representative
G-N1 and T/U-N3 relaxation dispersion profiles (pH 6.8-6.9 and 25°C).
Best fits to the Bloch-McConnell (B-M) equations are shown. c, A plot of
ES1 chemical shift differences, ⟨Δω_G-N1⟩ against ⟨Δω_T/U-N3⟩, as
measured by NMR relaxation dispersion. The blue line indicates the fit to
the Δω values using the fundamental tautomer chemical shifts as variables
(Supplementary Methods). The red line indicates predictions from density


changes in stacking with immediate neighbours that accompany the 
transition from the wobble to the Watson-Crick geometry (Fig. 2f). For 
example, GGG has the highest p_ES1 and is predicted to gain stacking
overlap, whereas CGG has the lowest p_ES1 and is predicted to lose stacking
overlap. Similar sequence-dependent effects have been reported for lesion 
repair by methyltransferases*°. Notably, dG dominates the changes in 
stacking, potentially explaining the stronger sequence dependence of the
G^enol·T/U population compared with that of G·T^enol/U^enol.
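
The relationships between the exchange parameters discussed above can be made concrete with a short sketch: for two-state exchange the excited-state population follows from the forward and reverse rate constants, and K_t then partitions the ES1 population between the two tautomers. The rate constants and K_t used below are hypothetical values chosen for illustration, and the function names are not from the original work.

```python
def two_state_populations(k_forward, k_reverse):
    """Equilibrium (p_GS, p_ES) for GS <-> ES with the given rate constants (s^-1)."""
    p_es = k_forward / (k_forward + k_reverse)
    return 1.0 - p_es, p_es


def split_es1(p_es1, k_t):
    """Partition p_ES1 into (p_Genol, p_Tenol/Uenol) using K_t = p_Genol / p_Tenol/Uenol."""
    p_g_enol = p_es1 * k_t / (1.0 + k_t)
    return p_g_enol, p_es1 - p_g_enol


# Hypothetical example: k_GS->ES1 = 1 s^-1, k_ES1->GS = 999 s^-1, K_t = 3.
p_gs, p_es1 = two_state_populations(1.0, 999.0)
p_g_enol, p_t_enol = split_es1(p_es1, 3.0)
print(p_es1, p_g_enol, p_t_enol)
```

With K_t above 1, most of any change in p_ES1 is carried by the G^enol·T/U term, which is consistent with the correlation between p_ES1 and K_t described in the text.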


Sequence-dependent anion equilibria

Next, we examined whether anionic WC-like G·T^-/U^- mismatches (Fig. 1c) also
form commonly in DNA and RNA, and whether anionic G^-·T/U
remains undetectable under these different environments. We
measured relaxation dispersion at high pH (7.8 and above) for G·T/U
mismatches in a subset of our RNA (Fig. 3a and Extended Data Fig. 5a)
and DNA (Fig. 3b and Extended Data Fig. 5b) constructs. In all cases,
we observed relaxation dispersion consistent with wobble ⇌ anion
exchange, thus establishing the robustness of this process across
different sequence contexts (Fig. 3a, b and Extended Data Fig. 5).
Three-state fitting of the relaxation dispersion data, assuming a starlike
topology (Extended Data Fig. 6a), yielded large Δω_T/U-N3(ES2) values of
approximately 55 p.p.m. and much smaller Δω_G-N1(ES2) values of
approximately 5 p.p.m., consistent with a dominant G·T^-/U^- species


functional theory. d, Lower bounds for the rate of tautomeric G^enol·T/U ⇌
G·T^enol/U^enol exchange. Contour plots showing scaled χ² weights for
combinations of k; against kfenol/Uenols red indicates the better fit. e, A plot 
of tautomeric ES1 population against K_t for DNA (blue, n = 5) and RNA
(cyan, n = 6) constructs determined at pH 6.9 and 25°C. f, Plot of ES1
population against change in stacking overlap (ΔÅ² = Å²(WC) − Å²(WB))
between wobble and Watson-Crick-like mismatches (pH 6.9, 25°C) for
five DNA sequence contexts (Supplementary Methods). Error bars in b, c 
and e reflect experimental uncertainty (one s.d., Supplementary Methods). 


and with no evidence of G^-·T/U. Again, we observe strong
sequence-specific variations in the ES2 population (p_ES2) and in the
values of k_GS→ES2 and k_ES2→GS across different temperatures and
pH values (Supplementary Tables 4, 5).

A previous study® showed that the emergence of anionic ES2 at 
high pH values was accompanied by unexpected changes in the tau- 
tomeric chemical shifts of ES1. Similar deviations are observed here 
for both RNA and DNA (Supplementary Table 4). We postulated that 
‘minor’ exchange*® between ES1 and ES2 could ‘mix’ their chemical 
shifts and give rise to such deviations (Extended Data Fig. 6a, b and 
Supplementary Table 4). Indeed, all five relaxation dispersion pro- 
files with unusual ES1 chemical shifts showed a statistically signif- 
icant improvement when the data was fitted to a three-state model 
with minor exchange in a triangular rather than a starlike topology 
(Fig. 3a, b, Extended Data Fig. 6c, d and Supplementary Tables 4-6). 
The resulting ES1 15N rG-N1 and rU-N3 chemical shifts vary less
significantly with pH (Extended Data Fig. 6e and Supplementary
Table 4) and the rate constants (k_ES1→ES2 and k_ES2→ES1) exhibit the
expected temperature dependence (Extended Data Fig. 6f), neither
of which would be expected if the data were being spuriously
overfitted. It should be noted that limited or poor quality relaxation 
dispersion data can make it difficult to resolve different topologies*” 
(Supplementary Table 6). 
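
Model selection between the starlike and triangular topologies rests on information-criterion weights. The sketch below shows one standard way such weights are computed, assuming Gaussian errors so that χ² stands in for −2 ln L up to a constant; the exact definitions used in this work are given in its Supplementary Methods and are not reproduced here, and the χ² values, parameter counts and number of data points are hypothetical.

```python
import math


def aic(chi2, n_params):
    """Akaike's information criterion in least-squares form."""
    return chi2 + 2.0 * n_params


def bic(chi2, n_params, n_points):
    """Bayesian information criterion in least-squares form."""
    return chi2 + n_params * math.log(n_points)


def ic_weights(scores):
    """Relative weights exp(-delta/2), normalised to sum to 1 (lower score is better)."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical fits: starlike (7 parameters) versus triangular (9 parameters).
n_points = 100
star = dict(chi2=150.0, n_params=7)
tri = dict(chi2=120.0, n_params=9)
w_aic = ic_weights([aic(**star), aic(**tri)])
w_bic = ic_weights([bic(n_points=n_points, **star), bic(n_points=n_points, **tri)])
print(w_aic, w_bic)
```

The extra parameters of the triangular model are penalised in both criteria, so a weight close to 1 for that model indicates that the improvement in fit outweighs the added complexity.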

Figure 3 | Three-state exchange with triangular topology and minor
exchange between tautomeric and anionic WC-like excited states.
a, b, Comparison of three-state B-M fit with triangular (left) and starlike
(right) topologies with the relaxation dispersion profiles measured in
hpUG-CGC RNA (a) and hpTG-GGC DNA (b). Statistical Akaike's
information criterion and Bayesian information criterion weights
(w_AIC and w_BIC, respectively) comparing starlike and triangular topologies
are shown. Error bars reflect experimental uncertainty (one s.d.,
Supplementary Methods).


Tautomerization and ionization during misincorporation 
dG·dT misincorporation is the most frequent base-substitution error
committed by high-fidelity DNA polymerases, with a misincorporation
frequency F_pol of 10^-4 to 10^-5 for most studied polymerases (where

Figure 4 | Kinetic mechanism of dG·dT misincorporation. a, Exchange
between wobble G·T/U mispair (top left), rapidly interconverting
WC-like tautomers (G^enol·T/U ⇌ G·T^enol/U^enol, bottom), and anionic
WC-like G·T^-/U^- (top right). Exchange between anionic G·T^-/U^- and a
low-abundance, short-lived anionic G^-·T/U, or other non-WC species that
cannot be detected by relaxation dispersion, cannot be ruled out. WC-like
G·T/U populations and ranges recorded at pH 6.4-8.9 and 10-25 °C

F_pol = (k_pol/K_d)^incorrect / (k_pol/K_d)^correct; k_pol is the maximum rate of
nucleotide incorporation and K_d is the apparent nucleotide equilibrium
dissociation constant); that is, an error is committed at a frequency of 1
in every 10^4 to 10^5 nucleotide incorporations. Differences in apparent
nucleotide binding affinities (K_d^incorrect versus K_d^correct) account for a
factor of only around 10^-1 to 10^-2 in discrimination, whereas differences
in the polymerization rates (k_pol^incorrect versus k_pol^correct) account for a
factor of approximately 10^-3.
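
The definition of F_pol separates naturally into a rate term and a binding term, which the following sketch evaluates. The kinetic constants are hypothetical values chosen for illustration (they are not measurements from this work), and the function name is illustrative.

```python
def misincorporation_frequency(kpol_incorrect, kd_incorrect, kpol_correct, kd_correct):
    """F_pol = (k_pol/K_d)_incorrect / (k_pol/K_d)_correct."""
    return (kpol_incorrect / kd_incorrect) / (kpol_correct / kd_correct)


# Hypothetical constants: k_pol in s^-1, K_d in micromolar.
kpol_inc, kd_inc = 0.5, 500.0
kpol_cor, kd_cor = 100.0, 10.0

f_pol = misincorporation_frequency(kpol_inc, kd_inc, kpol_cor, kd_cor)
rate_factor = kpol_inc / kpol_cor      # discrimination from polymerization rates
affinity_factor = kd_cor / kd_inc      # discrimination from binding affinities
print(f_pol, rate_factor, affinity_factor)
```

With these placeholder numbers the rate term contributes a larger share of the discrimination than the binding term, and the product of the two factors equals F_pol.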

The mechanisms that lower the values of k_pol^incorrect relative to k_pol^correct
are still poorly understood. Several decades ago, Topal and Fresco postulated
that the frequency of tautomerization may be an important
determinant of misincorporation probability1. Notably, the population
of the tautomeric species (around 10^-3) is comparable to the
values of k_pol^incorrect/k_pol^correct. In addition, the rate at which the wobble
dG·dT forms either the WC-like tautomeric (k_GS→ES1 = 0.3-10 s^-1;
Supplementary Tables 1, 5) or anionic (k_GS→ES2 = 1.1-124 s^-1;
Supplementary Tables 1, 5) mismatches (Fig. 4a) is comparable
to the values of k_pol^incorrect (0.16-1.16 s^-1) measured for incorrect
dG·dTTP or dGTP·dT misincorporation, whereas it is up to
approximately 1000-fold slower than k_pol^correct (25-275 s^-1) measured
for correct dG·dCTP or dGTP·dC. If the formation of WC-like
dG·dT mismatches (Fig. 4a) is required for misincorporation after the
initial binding of dNTP in a wobble conformation, it could provide a
mechanism for lowering k_pol^incorrect relative to k_pol^correct. Indeed,
previous studies have shown that DNA polymerases cannot undergo 
the necessary conformational changes needed for catalysis when dGedT 
is in a wobble conformation’ and all available structures of catalyti- 
cally active polymerases with bound mismatches within the active site 
feature WC-like dGedT or dAedC geometries”. Similarly, WC-like 
rGerU mismatches have been shown to form in the first and second 
codon positions of catalytically active ribosomes’, in which wobbles 
are typically rejected®, which may help to explain translational error 
hotspots”. 

To examine this possibility, we built a kinetic model for dG·dTTP
misincorporation by inserting a tautomerization or ionization step 

(Supplementary Tables 1, 5). b, Minimal kinetic mechanism for
polymerization. Incorporation of an incorrect dTTP includes an
additional tautomerization or ionization step, allowing for the formation
of a Watson-Crick-like dG·dT mismatch. Discriminatory steps are in red.
E, open polymerase conformation; E′, closed conformation; E*, closed,
catalytically competent conformation.

Figure 5 | Measured versus predicted misincorporation probabilities
and rates. a, F_pol as measured experimentally for dTTP·dG
misincorporation for human DNA polymerase ε, rat DNA polymerase β
and T7 DNA polymerase, with values simulated using M_ES1, M_ES2,
M_ES1+ES2 and M_Kd (error bars represent one s.d., Supplementary
Methods). b, Flux pathways for dT·dG(GGC) (top) and 5BrdU·dG(CGC)
(bottom), at different temperatures and pH values. c, Left, measured and
simulated F_pol values for dTTP/5BrdUTP in AMV RT. Right, measured
and simulated k_obs values for dTTP/5BrdUTP misincorporation for human


(Fig. 4a) after initial nucleotide binding in a wobble conformation 
and before the pre-chemistry conformational change in the existing 
minimal kinetic model for correct incorporation” (Fig. 4b). All other 
steps, including the pre-chemistry conformational change and the 
phosphodiester bond formation, are assumed to have identical kinetic 
parameters as measured for correct nucleotide incorporation??**3-* 
(Supplementary Table 7). The model assumes that misincorporation 
directly from the wobble conformation is negligible and that the 
tautomerization and ionization rates measured in duplex DNA by 
NMR approximate the rates in the polymerase active site. We tested 
models (Extended Data Fig. 7) in which the tautomeric (M_ES1),
the anionic (M_ES2), or both (M_ES1+ES2) species can be misincorporated,
as well as models that excluded the triangular network
altogether (M_Kd).
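
The effect of inserting a tautomerization or ionization step can be illustrated with a deliberately reduced scheme, wobble ⇌ WC-like → product, solved at steady state for the WC-like intermediate. This is a toy sketch, not the full kinetic model described above: it collapses every step after formation of the WC-like mismatch into a single irreversible rate constant (here called k_chem, a hypothetical name), and the numbers are placeholders.

```python
def effective_rate(k_fwd, k_rev, k_chem):
    """Steady-state rate (s^-1) for wobble <-> WC-like -> product.

    k_fwd:  wobble -> WC-like (tautomerization or ionization)
    k_rev:  WC-like -> wobble
    k_chem: lumped rate of all subsequent steps (hypothetical simplification)
    """
    return k_fwd * k_chem / (k_rev + k_chem)


# Hypothetical values: ~0.1% WC-like population (k_fwd / k_rev = 1e-3).
print(effective_rate(1.0, 1000.0, 100.0))   # intermediate regime
print(effective_rate(1.0, 1000.0, 1.0e9))   # fast chemistry: limited by k_fwd
print(effective_rate(1.0, 1000.0, 0.01))    # slow chemistry: (k_fwd/k_rev) * k_chem
```

The two limits show why both the rate of forming the WC-like mismatch and its population can matter: fast downstream steps make the exchange step rate-limiting, whereas slow downstream steps scale the observed rate by the equilibrium population.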

Notably, the most general M_ES1+ES2 model robustly predicts the
measured F_pol values for three polymerases (T7, polymerase ε and
polymerase β) that have varying rate-limiting steps and microscopic
rate constants (Fig. 5a, Extended Data Fig. 8 and Supplementary
Table 7). Similar results are obtained with M_ES1 under these neutral
conditions, in which the ES2 population is negligible (<10~° at pH 6.9) 
(Fig. 5a and Extended Data Fig. 8). By contrast, M_ES2 consistently
underestimates F_pol by one to two orders of magnitude, whereas M_Kd
overestimates F_pol by one to two orders of magnitude (Fig. 5a).
Variants of the M_Kd model, in which only preformed tautomeric dNTP
with populations of 10~4-10~° bind in a productive WC-like geometry, 
overestimate k_pol and K_d by several orders of magnitude (data not
shown). These data indicate that the formation of tautomeric WC-like
dG^enol·dT and dG·dT^enol at a population of around 0.1% can account
for the approximately 10°-10°-fold lower value of kyo" relative to 
k_pol^correct, and that at neutral pH more than 99% of misincorporation
proceeds via the tautomeric species, which form predominantly via 
direct exchange from the wobble (Fig. 5b). 

DNA polymerase β. Error is s.d. of n = 3 biological replicates for kinetic
assays, or previously published error for AMV RT. Error for kinetic
simulations is described in the Supplementary Methods. d, Measured and
simulated k_obs for dTTP misincorporation for human DNA polymerase β
in different sequence contexts. The asterisk indicates that ES2 exchange
rates were extrapolated (Supplementary Methods). Error is s.d. of n = 3
biological replicates for kinetic assays; error for kinetic simulations is
described in the Supplementary Methods.


Impact of pH, modifications and sequence 

We also examined whether the M_ES1+ES2 model can reproduce the
dependence of the misincorporation probability on pH, base modification
and sequence. M_ES1+ES2 accurately predicts the approximately
threefold increase in misincorporation probability observed
with increasing pH (Fig. 5c, left). This can be attributed to an increase
in the population of dG·dT^-, which accounts for more than 70% of
the net misincorporation at pH 8.4 (Fig. 5b, c). By contrast, M_ES1 fails
to predict this increase in misincorporation probability (Fig. 5c, left;
M_Kd not shown owing to absence of pH-dependent K_d values).
At high pH values, the tautomeric and anionic species have comparable
populations, and there is significant flux (greater than
20%) towards both tautomeric and anionic species through the
indirect minor exchange pathway (Fig. 5b). In this manner, the contributions
of the tautomeric and anionic species to misincorporation
are coupled.

M_ES1+ES2 and NMR relaxation dispersion measurements also accurately
predict F_pol and k_obs for 5-bromo-2′-deoxyuridine triphosphate
(5BrdUTP) (Fig. 5c). This includes a sharper, approximately sixfold
increase in F_pol^5BrdUTP measured for avian myeloblastosis virus
reverse transcriptase (AMV RT) when increasing the pH from 6.9 to
8.4 (Fig. 5c, left). This can be attributed to the lower pK_a of dG·5BrdU^-
(pK_a ≈ 9) (ref. 14) relative to dG·dT^- (pK_a ≈ 11.8) (ref. 8). We further
verified the robustness of these predictions by measuring k_obs^5BrdUTP
and k_obs^dTTP for human DNA polymerase β at high pH (8.4). The
model accurately predicts the approximately fourfold enhancement
in k_obs^5BrdUTP relative to k_obs^dTTP (Fig. 5c, right). Again, M_ES1 fails
to predict these variations (Fig. 5c). Indeed, at both neutral and high
pH values, 5BrdUTP is predicted to be predominantly misincorporated
via the more populated dGedT (Fig. 5b, c). These data indicate that 
misincorporation due to dG·dT^- can dominate at pH values of 8.4 or
above, or for chemically modified nucleotides at neutral pH.
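
The qualitative effect of the lower apparent pK_a can be illustrated with the Henderson-Hasselbalch relation, treating ionization as a simple single-site titration. This is an assumption made only for the sketch; the populations reported in this work come from relaxation dispersion measurements, not from this formula. The pK_a values are those quoted in the text, and the function name is illustrative.

```python
def deprotonated_fraction(ph, pka):
    """Fraction of a single titratable site that is deprotonated at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


for label, pka in (("dG·5BrdU", 9.0), ("dG·dT", 11.8)):
    for ph in (6.9, 8.4):
        print(label, ph, f"{deprotonated_fraction(ph, pka):.2e}")
```

In this simplified picture the site with the lower apparent pK_a is far more ionized at any given pH, which is in line with the larger contribution of the anionic species for the brominated nucleotide.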

Importantly, owing to the sequence-dependence of tautomerization
and ionization, M_ES1+ES2 also predicts sequence-specific variations in
values of F_pol of approximately eightfold at pH 8.4 (Fig. 5d). Comparable
(fivefold) sequence-specific variations have been reported previously.
We tested these predictions using human DNA polymerase β at
pH 8.4 for nine different sequence contexts (Supplementary Table 8).
Whereas k_obs^dG·dCTP varied weakly (less than 1.2-fold) with sequence,
k_obs^dG·dTTP varied approximately 45-fold (Extended Data Fig. 9);
larger changes were observed when the base pair was altered at the
n − 1 position, which stacks with dG·dTTP in the polymerase active
site (Fig. 5d). Although the M_ES1+ES2 predictions slightly underestimate
the sequence-specific variations in k_obs^dG·dTTP, this is not too
surprising considering that other reaction steps could also vary with
sequence. The predictions do recapitulate the lower k_obs^dG·dTTP for CGA
and comparable values for GGC and CGC (Fig. 5d). Notably, the two
major outliers (TGA and GGG) arise primarily because of a large ES2
population. It is likely that the polymerase environment, including the
absence of base pairs at the n + 1 position (Fig. 5d), can influence the
sequence-specific dependence of tautomerization and ionization, and
consequently influence misincorporation.

Our data indicate that the formation of WC-like anionic and
tautomeric mismatches helps to determine the frequency of dG·dT
misincorporation and its dependence on pH, chemical modifications
and possibly sequence. Our analysis indicates that F_pol is determined
primarily by the ES1 population, and that considerable reductions in
k_ex = k_GS→ES1 + k_ES1→GS, outside of the range detected here, would be
required to substantially reduce F_pol (Extended Data Fig. 10). Although it
is likely that differences in the active site environment of the polymerase 
will tune tautomerization and ionization dynamics, the robustness 
of the predictions across different polymerases, pH conditions, and 
modified nucleotides suggests that it will not cause substantial per- 
turbations relative to the broad kinetic range examined here. Indeed, 
very small differences in tautomerization and ionization dynamics are 
observed for DNA and RNA, which have different helical structures 
and stabilities. It is possible that tautomerization and ionization are 
dominated by the energetics of hydrogen bonding and proton transfer, 
and that the natural grip over WC geometry in the double helix is similar 
to that achieved by the polymerase in the context of an isolated dNTP 
paired to the template. Other mechanisms may be applicable for 
purine-purine mismatches for which alterations in the active site have 
been proposed rather than the adoption of a WC-like base pair#*“*. 
The approach presented here can be applied to examine the roles of 
other tautomeric and anionic mismatches in replication, transcription, 
translation and DNA repair’. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references 
unique to these sections appear only in the online paper. 


Data Availability The data that support the findings of this study are available 
upon reasonable request from the corresponding authors. 

Extended Data Figure 2 | DNA and RNA constructs used in this study.
a, Secondary structures of the various DNA and RNA constructs used
in this study. G·T/U mismatches that show signs of chemical exchange
directed towards tautomeric and/or anionic WC-like mismatches are
highlighted in blue and green, respectively. G·T/U mismatches that
show no evidence for WC-like relaxation dispersion are highlighted in
brown. The value of K_t measured at near-neutral pH is shown next to
each mismatch. The DickersonTG-CGA, hpTG-CGC and hpUG-CGC
sequence contexts were studied in a previous publication8. b, 2D [15N, 1H]
HSQC spectra of DNA and RNA constructs used in this study showing
the imino resonances of G-N1/H1 and T/U-N3/H3 targeted for relaxation
dispersion measurements. The spectrum shown for xptG was collected at
pH 6.7 and 25°C in potassium acetate buffer as described previously8.

three-state fits. Constructs containing a chemically modified base are


50. Xue, Y. et al. Characterizing RNA excited states using NMR relaxation 
dispersion. Methods Enzymol. 558, 39-73 (2015). 

Extended Data Figure 4 | Establishing lower limits for the rates
of base pair tautomeric exchange. Agreement between measured and
predicted R1ρ values (scaled χ² weight, equation (2) in Supplementary

Extended Data Figure 7 | Kinetic mechanisms used to model misincorporation. Rate constants for each step are listed in Supplementary Table 7

100 1M dNTP. DNA template sequence (5’ to 3’) is read from bottom

Extended Data Figure 10 | F_pol is primarily governed by ES1
populations. Simulated F_pol values as a function of scaling up or scaling
down of the kinetic exchange rate for ES1 formation (k_ex = k_GS→ES1 +
k_ES1→GS) without altering the ES1 population. Increasing k_ex beyond
values measured experimentally in this study (green dotted line)
minimally affects F_pol; decreasing k_ex within the range measured
experimentally in this study (purple dotted line) also affects the value
of F_pol only minimally. Much larger decreases in k_ex are required to
significantly reduce the value of F_pol.

Cryo-EM shows how dynactin recruits
two dyneins for faster movement


Linas Urnavicius, Clinton K. Lau, Mohamed M. Elshenawy, Edgar Morales-Rios, Carina Motz, Ahmet Yildiz &
Andrew P. Carter


Dynein and its cofactor dynactin form a highly processive microtubule motor in the presence of an activating adaptor, 
such as BICD2. Different adaptors link dynein and dynactin to distinct cargoes. Here we use electron microscopy and 
single-molecule studies to show that adaptors can recruit a second dynein to dynactin. Whereas BICD2 is biased towards
recruiting a single dynein, the adaptors BICDR1 and HOOK3 predominantly recruit two dyneins. We find that the shift
towards a double dynein complex increases both the force and speed of the microtubule motor. Our 3.5 Å resolution cryo-electron
microscopy reconstruction of a dynein tail-dynactin-BICDR1 complex reveals how dynactin can act as a scaffold
to coordinate two dyneins side-by-side. Our work provides a structural basis for understanding how diverse adaptors 
recruit different numbers of dyneins and regulate the motile properties of the dynein-dynactin transport machine. 


Cytoplasmic dynein-1 (dynein) is the main transporter of cargoes 
towards the minus ends of microtubules in animal cells!. These cargoes 
move at a range of speeds” and vary in size from large organelles* 
to small individual proteins*. Dynein is activated to form a highly 
processive motor by binding its cofactor dynactin and a cargo adaptor, 
such as BICD2 (bicaudal D homologue 2)°*. Dynein contains two 
motor domains joined by a tail region, whereas dynactin is built 
around a short actin-like filament, capped at its pointed and barbed 
ends and decorated with a shoulder®®. A previous 8 A resolution 
cryo-electron microscopy (cryo-EM) structure showed how a coiled 
coil in BICD2 recruits the tail of dynein to the filament of dynactin®. 
Other adaptors that activate dynein and link it to different cargoes have 
been identified®!°"', These activating adaptors also contain long coiled 
coils; however, the sequence similarity between them is low!?-} and 
it is unclear whether they engage dynein and/or dynactin in the same 
way as BICD2 does. There is also evidence that certain adaptors,
such as BICDR1 (BICD related-1, also known as BICDL1) and
HOOK3, drive faster movement of dynein towards the minus
ends of microtubules when compared with BICD2, although the mech- 
anism underpinning this increased speed is not currently understood. 


Dynactin can recruit two dyneins 

We determined the cryo-EM structures of two previously unsolved 
dynein-dynactin-adaptor complexes. BICDR1, like BICD2, binds
RAB6 vesicles!®, whereas HOOK3 links dynein and dynactin to 
early endosomes. We determined 7 Å resolution maps of both
the dynein tail-dynactin-BICDR1 complex (hereafter termed TDR) 
and the dynein tail-dynactin-HOOK3 complex (hereafter termed 
TDH), which we compare to the previously determined structure of 
the dynein tail-dynactin-BICD2 complex (hereafter termed TDB)® 
(Fig. 1a, Extended Data Fig. 1a-d, Extended Data Table 1).

The coiled coils of all three adaptors run along the length of the dyn- 
actin filament (Fig. 1a). However, in contrast to previous predictions?°, 
each adaptor makes different interactions. BICD2 and BICDR1 diverge
in their path and relative rotation (Fig. 1b). HOOK3 follows yet another 
route over the surface of dynactin (Fig. 1c). TDH also shows an extra 


coiled-coil density near the pointed end of dynactin (Fig. 1c) and extra 
globular density towards its barbed end (Extended Data Fig. 1e, f).
The identity of the second coiled coil is unclear, but the globular 
density probably corresponds to the N-terminal Hook domain, which 
is required for HOOK3 to activate dynein and dynactin'!!”. 

The most notable feature of TDR and TDH is the presence of two 
dynein tails (Fig. 1a). The tail of the first dynein (dynein-A) binds in 
an equivalent position to the dynein tail in TDB® and to the full-length 
dynein in the dynein-dynactin-BICD2 complex (hereafter termed 
DDB)”. The second dynein (dynein-B) binds next to dynein-A near 
the barbed end of dynactin. 


Adaptors determine dynein recruitment 

We tested whether BICD2, BICDR1 and HOOK3 recruited 
different numbers of dyneins in moving dynein-dynactin complexes. 
We mixed dyneins that had been labelled with tetramethylrhodamine 
(TMR) or Alexa Fluor 647, and used single-molecule fluorescence 
microscopy to measure the frequency at which the two dyes 
colocalized on microtubules (Fig. 2a, b). In the presence of dynactin and 
BICD2, 13 ± 1% (s.e.m.) of processive complexes were labelled with 
both dyes; this was significantly higher (P < 0.0001) than the 
colocalization observed for the dynein-only control (2.1 ± 0.3%). Using 
BICDR1 or HOOK3 as an adaptor led to colocalization percentages 
of 31 ± 2% and 34 ± 1%, respectively (Fig. 2b). After correction for 
complexes that were double-labelled with the same colour, we estimate 
that 26% of BICD2 complexes contained two dyneins, compared to 61% 
for BICDR1 and 67% for HOOK3. We conclude that the majority of 
motile complexes that contain BICD2 have one dynein, whereas both 
BICDR1 and HOOK3 preferentially recruit two. 
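
The same-colour correction can be illustrated with a short calculation. This is a minimal sketch rather than the authors' analysis code: it assumes that the two dye-labelled dynein pools were mixed in equal amounts, so that a complex holding two dyneins shows both colours with probability 1/2. The function name and the probability parameter are illustrative. Because the sketch starts from the rounded percentages quoted above, its results for BICDR1 and HOOK3 differ from the published estimates by about one percentage point.

```python
def two_dynein_fraction(colocalized_pct, p_both_colours=0.5):
    """Estimate the percentage of complexes that hold two dyneins.

    Assumption (not spelled out in the text): the two labelled pools are
    mixed equally, so a two-dynein complex carries both colours with
    probability p_both_colours = 0.5. The remaining two-dynein complexes
    carry the same colour twice and are missed by the colocalization readout.
    """
    return colocalized_pct / p_both_colours


# Rounded colocalization percentages quoted in the text.
observed = {"BICD2": 13.0, "BICDR1": 31.0, "HOOK3": 34.0}
estimates = {name: two_dynein_fraction(pct) for name, pct in observed.items()}

for name, pct in estimates.items():
    print(f"{name}: about {pct:.0f}% of complexes with two dyneins")
```

Under this assumption the estimate is simply twice the observed colocalization, which reproduces the 26% quoted for BICD2.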

Although both this study and previous work are consistent 
with BICD2 predominantly recruiting one dynein, its ability to 
recruit a second was unanticipated. To verify this observation, we 
applied a mixture of BICD2, dynein tail and dynactin onto grids for 
negative-stain electron microscopy analysis (Fig. 2c). In agreement 
with our single-molecule data, 3D classification of adaptor complexes 
showed that 17 ± 1% of BICD2 complexes contained two dyneins; 

Figure 1 | Dynactin can recruit two dyneins. a, Sub-7 Å cryo-EM maps of 
TDR and TDH complexes, coloured according to their components. TDB 
complex (Electron Microscopy Data Bank (EMDB) code: EMD-2862) is 
included for comparison. b, Surface representation molecular models of 
BICDR1 and BICD2 on dynactin show the divergent paths of the coiled 
coils. c, Comparison of HOOK3 and BICD2 on dynactin. 


69 ± 4% contained only one dynein, and the rest were ambiguous. The 
ability of BICD2 to bind two dyneins also agrees with a cryo-electron 
tomography reconstruction of microtubule-bound DDB. 
Negative-stain electron microscopy of BICDR1 and HOOK3 complexes showed 
that 94 ± 2% and 88 ± 1% of these complexes, respectively, contained 
two dyneins (Fig. 2c). This suggests an even higher degree of second 
dynein recruitment than is indicated by our single-molecule data. Our 
data suggest that the number of dyneins bound to dynactin can be 
controlled by the identity of the adaptor. 


Two dyneins increase force and speed 

We next investigated how the recruitment of different numbers of 
motors affects the motile properties of the dynein-dynactin complex. 
We used an optical trap to measure the stall force of DDB, 
dynein-dynactin-BICDR1 (hereafter termed DDR) and 
dynein-dynactin-HOOK3 (hereafter termed DDH) (Fig. 3a, b). Similar to our previous 
measurements, the stall force of DDB is 3.7 ± 0.2 pN, which is 
significantly lower (P < 0.0001) than the stall force of the plus-end-directed 
motor kinesin-1 (5.7 ± 0.2 pN). By comparison, the stall force of DDR 
is 6.5 ± 0.3 pN and that of DDH is 4.9 ± 0.2 pN (Fig. 3b), which suggests 
that recruiting higher numbers of dyneins to dynactin increases force 

Figure 2 | Different adaptors recruit different numbers of dyneins to 
dynactin. a, Kymographs of DDB, DDR and DDH. Moving complexes 
contain TMR-dynein (magenta), Alexa Fluor 647-dynein (green) or both 
(white, white arrowheads). Experiment was independently repeated twice, 
with over three replicates per repeat. b, The mean percentage ± s.e.m. of 
complexes containing both TMR- and Alexa Fluor 647-dynein (n = 7,793 
(dynein), n = 3,092 (DDB), n = 3,107 (DDR), n = 3,990 (DDH); ANOVA 
with Tukey’s test; ****P < 0.0001; NS, not significant, P = 0.4010). 
c, Representative negative-stain electron microscopy 3D classes of 
one- and two-dynein complexes for TDB, TDR and TDH. The mean ± s.e.m. 
fraction of particles in each class is shown. Ambiguous classes are 
not shown. 


production. This agrees with previous reports that concluded, on the 
basis of using dyneins with beads, that dyneins can team up efficiently 
for maximum mechanical output. The difference in stall force 
between the DDR and DDH dynein-dynactin complexes suggests that 
features other than motor number can also fine-tune force production. 
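
To put the stall forces on a common scale, the short sketch below expresses each one relative to DDB and propagates the quoted standard errors. It is an illustration under stated assumptions, not part of the published analysis: the errors are treated as independent, standard first-order error propagation for a ratio is used, and the function name `ratio_with_sem` is hypothetical.

```python
import math


def ratio_with_sem(a, sem_a, b, sem_b):
    """Return a/b and its propagated standard error.

    Uses first-order propagation for a ratio and assumes the two
    measurements have independent errors.
    """
    ratio = a / b
    rel_err = math.sqrt((sem_a / a) ** 2 + (sem_b / b) ** 2)
    return ratio, ratio * rel_err


# Stall forces in pN as (mean, s.e.m.), taken from the text.
stall = {
    "DDB": (3.7, 0.2),
    "DDR": (6.5, 0.3),
    "DDH": (4.9, 0.2),
    "kinesin-1": (5.7, 0.2),
}

relative = {
    name: ratio_with_sem(*stall[name], *stall["DDB"])
    for name in ("DDR", "DDH", "kinesin-1")
}

for name, (ratio, err) in relative.items():
    print(f"{name} / DDB = {ratio:.2f} +/- {err:.2f}")
```

On these numbers DDR stalls at roughly 1.8 times the force of DDB and DDH at roughly 1.3 times, consistent with the point that motor number alone does not set the force.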

The higher stall force of DDR also suggests that it competes more 
efficiently with kinesin than DDB does. This may explain why 
neuronal overexpression of BICDR1, but not of BICD2, counteracts 
kinesin-driven transport of RAB6 vesicles and may be relevant to the 
role of BICDR1 in opposing anterograde movement in early neuronal 
differentiation. The ability of some adaptors to recruit multiple 
dyneins could also contribute to the clustering and pairing of dynein 
motors required to transport large cargoes. 

We next explored whether the recruitment of more dyneins to 
dynactin had an effect on speed. Previous work on BICDR1 in cells and 
HOOK3 in vitro has shown that complexes containing these 
adaptors move faster than those containing BICD2. Our data raise the 
possibility that these faster speeds are due to an increased number of 
complexes with two dyneins. However, previous reports have suggested 
that, although artificially tethering dyneins increases run length, it has 
little or no effect on velocity. 

To determine whether motor number affects the movement speed 
of dynein-dynactin complexes, we first directly compared all three 
adaptors in our in vitro motility assay. As expected, the run lengths 
of DDR and DDH were longer than that of DDB (Extended Data 
Fig. 2a). Notably, the average velocities of DDR (1.35 ± 0.04 µm s⁻¹) and 
DDH (1.23 ± 0.04 µm s⁻¹) were significantly higher than that of DDB 
(0.86 ± 0.04 µm s⁻¹, P < 0.0001) (Fig. 3c, Extended Data Fig. 2b, c). 
To investigate whether this difference in speed required the presence 
of two active dyneins, we mixed Alexa Fluor 647-labelled dynein with 
a TMR-labelled tail construct, BICDR1 and dynactin (Fig. 3d). For 
this experiment, we used a mutated full-length dynein that binds to 
dynactin as strongly as the dynein tail does but moves at wild-type 
velocities (Extended Data Fig. 2b, d). We compared the speeds of 


8 FEBRUARY 2018 | VOL 554 | NATURE | 203 


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 




[Figure 3 graphic omitted. Recoverable panel labels: a, traces for DDB, DDR, DDH and kinesin-1, recorded at 500 Hz and 5 kHz; speed axis in µm s⁻¹.]


Figure 3 | Two dyneins increase force and speed of dynein-dynactin. 
a, Traces showing typical stalls of beads driven by single DDB, DDR, DDH 
or kinesin-1. Experiments were independently repeated four, six, five and 
four times for DDB, DDR, DDH and kinesin-1, respectively. Arrowheads 
denote motor detachment from the microtubule after the stall. b, Scatter 
plots showing stall force distributions (n = 54 (DDB), n = 53 (DDR), 
n = 118 (DDH), n = 83 (kinesin-1)). c, Mean speeds of DDB, DDR and 
DDH (n = 3,343 (DDB), n = 3,162 (DDR), n = 3,744 (DDH)). d, Schematic 
depicting experimental design for TMR-tail-Alexa Fluor 647-dynein 
experiment. e, Dynein-only complexes move significantly faster than 
tail-dynein complexes (n = 1,004 (dynein-only), n = 939 (tail-dynein)). 
In b, c and e, horizontal lines represent mean ± s.e.m., ****P < 0.0001 
(ANOVA with Tukey’s test for b, c; unpaired two-sided t-test for e). 


moving complexes that contained only full-length dynein 
(‘dynein-only’) with those that contained one tail and one active dynein 
(‘tail-dynein’). As expected, dynein-only complexes moved at a similar speed 
(1.25 ± 0.04 µm s⁻¹, Fig. 3e) to DDR (1.22 ± 0.05 µm s⁻¹, Extended Data 
Fig. 2d). However, tail-dynein complexes moved significantly more 
slowly (0.84 ± 0.03 µm s⁻¹, P < 0.0001, Fig. 3e, Extended Data Fig. 2e) 
than either DDR or dynein-only complexes. This suggests that the 
presence of a second dynein increases the velocity of dynein-dynactin 
complexes. 
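
As a rough plausibility check on this comparison, a t statistic can be computed from the summary values alone. The sketch below is not the authors' test (they report an unpaired two-sided t-test on the underlying data); it uses the Welch form, which needs only the two means and their standard errors, and the function name is illustrative.

```python
import math


def welch_t_from_summary(mean_a, sem_a, mean_b, sem_b):
    """Welch-style t statistic from two means and their standard errors."""
    return (mean_a - mean_b) / math.sqrt(sem_a ** 2 + sem_b ** 2)


# Speeds in um/s as (mean, s.e.m.), taken from the text.
dynein_only = (1.25, 0.04)   # n = 1,004 complexes
tail_dynein = (0.84, 0.03)   # n = 939 complexes

t_stat = welch_t_from_summary(*dynein_only, *tail_dynein)
print(f"t = {t_stat:.1f}")
```

With around a thousand complexes in each group, a statistic of this size corresponds to a very small P value, in line with the reported P < 0.0001.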

We propose that the increase in speed on the recruitment of two 
dyneins is linked to the way in which dynactin recruits them 
side-by-side (Fig. 1a). This may restrict the inherent sideways and backwards 
movements of the motor domains and cause the complex to take 
a more direct and faster route along the microtubule. Other dynein 
regulators, such as LIS1 (otherwise known as PAFAH1B1), have been 
reported to increase the speed of dynein-dynactin complexes and 
could act by increasing motor copy number. For LIS1, however, 
quantitative fluorescence measurements suggest that this is not the case. 
The velocity of BICD2 complexes containing both fluorophores and 
therefore two dyneins (1.08 ± 0.03 µm s⁻¹, Extended Data Fig. 2f) was 
significantly faster than the average DDB velocity (P < 0.0001), but not 
as fast as DDR. This suggests that, in addition to recruiting two dyneins, 
certain adaptors also affect speed through small differences in how they 
recruit the motors to dynactin. 


The dynein-dynactin-BICDR1 structure 

To investigate how dynactin recruits two dyneins, we collected data 
sufficient to determine the TDR structure to an overall resolution of 
3.5 Å (Extended Data Fig. 3, Extended Data Table 1). To improve the tail 
density, we performed multiple rounds of particle signal subtraction, 
focused 3D classification and refinement on regions that moved as rigid 
blocks, which improved the definition of the blocks at each iteration 
(Extended Data Fig. 4). This produced a set of 3.4 Å maps that covered 
the entire length of the dynein tail (Extended Data Table 1). 

Figure 4 | Structure of the dynein HC and architecture of the TDR 
complex. a, Consensus molecular model of one dynein HC, complete 
with IC and LIC. HC is coloured according to helical bundle number. 
b, Assembled model of the TDR complex, showing the arrangement of 
dynein-A (cartoon) and dynein-B (cartoon) on BICDR1 (cartoon) and 
dynactin (surface). The HC NDD and dynein LC ROBL1 of dynein-B are 
labelled. 


Previous low-resolution structures showed that the dynein tail 
comprises two heavy chains (HCs), which consist of a series of helical 
bundles held together by an N-terminal dimerization domain (NDD). 
Each HC binds one intermediate chain (IC) and one light 
intermediate chain (LIC). The IC N-terminal regions are held together 
by the dynein light chains, ROBL1 (otherwise known as DYNLRB1), 
LC8 (otherwise known as DYNLL1) and TCTEX1 (also known as 
DYNLT1). We used our high-resolution maps to build an atomic 
model of the dynein tail. We de novo traced helical bundles 1-6 of the 
HC and the WD40 domain of the IC (Fig. 4a, Extended Data Figs 5a, b, 
6a, Extended Data Table 1). We also placed helices for part of helical 
bundle 7 and rebuilt homology models for the LIC and ROBL1 
(Fig. 4a, Extended Data Figs 5c, 6b, c). Our structure reveals that the IC 
WD40 domain makes extensive contacts with HC bundles 4 and 5, and 
that its central cavity is filled by a loop-helix from bundle 4 (Extended 
Data Fig. 6a). By contrast, the LIC globular domain interacts only with 
two helices from bundle 6. The tight binding of the LIC to the HC 
is the result of its N and C termini, which span out from the globular 
domain and form an integral part of HC bundles 5 and 7, respectively 
(Extended Data Figs 5c, 6b). 

We assembled and refined a model of the whole TDR complex 
(Fig. 4b, Extended Data Table 1, Supplementary Video 1) into our 
3.5 Å map. We used a previous dynactin structure and a model of the 
BICDR1 coiled-coil region. For each dynein, we fit in two copies each 
of HC, IC and LIC, one ROBL1 dimer and a new 1.9 Å crystal structure 
of the human NDD (Extended Data Fig. 6d, e, Extended Data Table 2). 


Structural basis of two-dynein recruitment 

Our TDR structure shows the two dyneins binding to grooves on 
the surface of dynactin that are formed by its β-actin subunit and the 
three actin-related protein 1 (ARP1, also known as ACTR1A) 
subunits ARP1F, ARP1D and ARP1B (Fig. 4b). The two dynein-A chains, 

Figure 5 | Interactions recruiting two dyneins to the TDR complex. 
a, Dynein HC (dynein-A, blue; dynein-B, pink) interactions with 
dynactin subunits (green). Contact residues on dynactin are shown as 
yellow spheres. For each HC, helix α6 is highlighted (dynein-A, dark 
blue; dynein-B, dark pink). b, A2 makes extensive interactions with B1. 
Interaction sites are shown as yellow and red spheres. c, Interactions of 


A1 and A2, and the first dynein-B chain, B1, all bind in a similar way, 
by making contacts to both sides of their respective grooves. Although 
the precise interactions overlap, they are all slightly different from 
one another (Fig. 5a). The second dynein-B chain, B2, binds between 
ARP1B and the barbed-end capping protein CAPZβ (also 
known as CAPZB) and is rotated by 90° about its long axis, relative to 
the other dynein chains. 

The two dyneins also make extensive interactions with one another. 
These consist of the IC WD40 domain of A2 binding the HC of B1; 
direct HC-to-HC interactions; and contacts between the A2 LIC and 
the HC and IC of B1 (Fig. 5b). These contacts are highly conserved 
across higher eukaryotes (Extended Data Fig. 7a). They contribute to 
a cascade of interactions (Supplementary Video 2) between the four 
dynein chains that include contacts between the IC WD40 domain of 
each chain and the neighbouring HC (Extended Data Fig. 7b). This 
network of connections stabilizes the binding of the second dynein 
and ensures all four HCs are held in a rigid orientation with respect 
to one another. This is likely to keep the dynein motor domains 
properly aligned and may be important for the increase in speed when 
two dyneins are recruited to the dynactin scaffold. 

Our structure reveals the key role BICDR1 has in recruiting two 
dyneins to dynactin. Dynein-A binds the adaptor in three places: its A1 
chain uses a single site on helical bundle 2, whereas its A2 chain binds 
via two sites (Fig. 5c). One of these sites also involves helical bundle 2, 
and the other uses helical bundle 5. Recruitment of dynein-B depends 
only on its B1 chain, which also uses sites on bundles 2 and 5. The first 
of these sites contacts the adaptor coiled coil in a position opposite 
the binding site of A2 (Fig. 5c). The second site does not directly 
contact the coiled coil, but instead touches density that packs against it 
(Fig. 5c, Extended Data Fig. 7c, d). Although the identity of this region 


[Figure 5 graphic omitted. Recoverable labels: Dynein-A2, BICDR1, HC bundle 5, extra density (LIC?), barbed end, Dynein-A, Dynein-B, BICD2.]

dynein-A (top) and B1 (bottom) with BICDR1. Interaction sites are marked 
in dark blue and red. Extra density from A2 LIC mediates the connection 
between B1 and BICDR1. d, Negative-stain electron microscopy 
reconstructions of DDB containing two dyneins (top) or dynein-A only 
(bottom), sliced to highlight BICD2. Arrows depict alternative positions of 
BICD2 at the barbed end of dynactin. 


is uncertain, there is a weak density connecting it to the LIC, which 
suggests that it corresponds to the flexible LIC C terminus (Extended 
Data Figs 5c, 7d). This region of LIC has previously been shown to 
interact with BICD2 and HOOK3. 


Adaptor position controls dynein number 

All three dynein-dynactin-adaptor complexes recruit dynein-A in a 
similar way, despite the differences in the positions of the adaptors 
themselves (Extended Data Fig. 8a). In TDR and TDH, dynein-B can 
contact the adaptor because the BICDR1 and HOOK3 N termini follow 
downward paths, stabilized by interactions with CAPZB. By contrast, 
in TDB no contact site for dynein-B is available because the adaptor 
is shifted upward towards the shoulder domain to contact ARP1A 
(Fig. 1b, Extended Data Fig. 8b). 

We investigated which structural changes allow BICD2 to recruit 
a second dynein (Fig. 2). We combined our negative-stain electron 
microscopy TDB datasets (Fig. 2c) to determine structures of suffi- 
cient quality to resolve the position of the coiled coil. We found that 
TDB with two dynein tails has BICD2 in a lower position, similar to 
BICDR1 and HOOK3 and different from its position in single-dynein- 
bound TDB (Fig. 5d). Our data suggest that a switch in the position 
of the N terminus of the adaptor is sufficient to account for dynein-B 
recruitment. 

In conclusion, we show that dynactin can act as a natural scaffold 
to line up two dyneins in close proximity. This arrangement results in 
a dynein-dynactin complex that moves faster and can produce larger 
forces when compared with complexes containing a single dynein. 
Our observations provide a mechanism by which cargo can control 
the output of the dynein-dynactin machine via the identity of its 
activating adaptor. 

METHODS 

Cloning. The following adaptors were codon-optimized for expression in 
Sf9 cells (Epoch Life Science): mouse BICDR1 (BICDL1), human HOOK3 
residues 1-522 and mouse BICD2 residues 1-400. Adaptors were subcloned into 
pOmniBac- and pACEBac1-derived vectors for expression in Sf9 cells. Tags were 
added for purification (a His6-ZZ tag with a TEV protease cleavage site (TEV), 
or a PreScission protease site (Psc) followed by a 2× strep affinity tag) or protein 
labelling (GFP or SNAPf (NEB)). The following constructs were generated: 
pOmniBac-His6-ZZ-TEV-SNAPf-BICD2(1-400), pOmniBac-His6-ZZ-TEV-BICDR1, 
pOmniBac-His6-ZZ-TEV-SNAPf-BICDR1, pOmniBac-His6-ZZ-TEV-BICDR1-SNAPf, 
pOmniBac-His6-ZZ-TEV-BICDR1-GFP, pACEBac1-HOOK3(1-522)-SNAPf-Psc-2×strep 
and pACEBac1-HOOK3(1-522)-GFP-Psc-2×strep. 

We generated a new dynein tail construct containing residues 1-1,455 of the 
human dynein HC. The fragment of the Sf9-codon-optimized DYNC1H1 gene 
was cloned into a pACEBac1 vector containing an N-terminal His6-ZZ-TEV tag and 
fused to pDyn2 (containing genes for human IC2C, LIC2, TCTEX1, LC8 and 
ROBL1) as described. 
Protein purification. Dynactin was purified from pig brains using the large-scale 
SP sepharose protocol. Wild-type human dynein and a mutant open dynein 
were expressed and purified using baculovirus as described. The two dynein tail 
constructs (HC(1-1455) and SNAPf-HC(1-1074)-GST) were purified as described. 

His6-ZZ-TEV-tagged adaptor constructs were purified as described. C-terminal 
Psc-strep-tagged constructs were purified as follows: pellets from 500 ml of Sf9 cell 
culture were resuspended in 50 ml of lysis buffer (30 mM Hepes-KOH pH 7.2, 
50 mM KAc, 2 mM MgAc, 1 mM EGTA, 10% (w/v) glycerol, 1 mM DTT) plus 
one complete-EDTA protease-inhibitor tablet (Roche) and 1 mM PMSF. Cells 
were dounced (30-40 strokes) and the lysate clarified in a Ti70 rotor (Beckman 
Coulter) at 30,000 r.c.f. for 20 min at 4 °C before loading onto a 1-ml StrepTrap HP 
column (GE Healthcare) pre-equilibrated in lysis buffer. The column was washed 
with 20 column volumes of lysis buffer and bound protein eluted using 7 column 
volumes of lysis buffer plus 3 mM d-desthiobiotin. Protein-containing fractions 
were concentrated to approximately 5 mg ml⁻¹ in 30-kDa cut-off Amicon 
centrifugal filters (Merck Millipore) and gel-filtered using a Superose-6 10/300 
column (GE Healthcare) pre-equilibrated in buffer containing 25 mM Hepes-KOH 
pH 7.2, 150 mM KCl, 1 mM MgCl2, 5 mM DTT. 

A C-terminal GFP-tagged truncated human kinesin-1 (K560-GFP) was 
prepared as described. 

N-terminal dimerization domain crystallization. Residues 1-201 of DYNC1H1 
were expressed using a modified pRSET(A) plasmid. Seleno-methionine-labelled 
NDD was expressed in a SoluBL21 Escherichia coli strain as described. 
It was purified from 2 l of culture using a 5-ml HisTrap column (GE Healthcare). 
Fractions were pooled, concentrated in a 30-kDa Amicon and applied to a 
Superdex200 10/300 gel filtration column (GE Healthcare) equilibrated with buffer 
containing 50 mM Tris-HCl pH 7.4, 150 mM KAc, 10 mM β-mercaptoethanol, 
2 mM MgAc, 1 mM EGTA, 10% (v/v) glycerol and inhibitor tablets (1 tablet per 
100 ml, complete-EDTA free). The NDD peak was concentrated to 10 mg ml⁻¹. For 
protein crystallization, 2 µl of protein was mixed with 2 µl precipitant (100 mM 
NaAc, pH 5.5, 10% (v/v) glycerol, 50 mM CaAc, 20% PEG 2,000 MME). Crystals 
were grown at 18 °C by hanging drop for 48 h, collected with microloops (Mitegen), 
dipped into mother liquor containing an extra 15% (v/v) glycerol and flash-frozen 
in liquid nitrogen. Single-wavelength anomalous diffraction data were collected at 
the ID29 beamline at ESRF, and integrated and scaled by the EDNA auto pipeline. 
The structure was solved in PHENIX, built in COOT and refined using 
REFMAC. 

Cryo-EM of TDR. Cryo-EM grids for TDR (dynein tail (HC(1-1455))-dynactin-BICDR1) 
were prepared similarly to TDB, though no cross-linker was used. 
Protein concentrations were chosen to give densely packed particles (approximately 
100 per image). Micrographs were recorded using an FEI Titan Krios equipped with 
a Falcon II detector (300 kV, 32 frames, 2-s exposure, 1.34 Å per pixel, 52 e⁻ per Å²) 
using automated data collection (EPU, FEI). Seven images were collected per hole 
(26,906 images in total, over 11 separate sessions). Drift correction was performed 
using MotionCor2 and contrast transfer function (CTF) parameters estimated 
using CTFFIND3_130307. Subsequent processing used Relion v.2.0 unless 
otherwise stated. 
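
The acquisition settings above imply a few derived quantities that are useful when reading the resolution figures: the dose per movie frame, the dose rate and the Nyquist limit set by the pixel size. The sketch below computes them from the values quoted in the text. The class and attribute names are illustrative and do not correspond to any acquisition or processing software.

```python
from dataclasses import dataclass


@dataclass
class Acquisition:
    """Movie acquisition settings, as quoted in the text."""
    pixel_size_A: float   # angstroms per pixel
    total_dose: float     # electrons per square angstrom, whole exposure
    frames: int           # number of movie frames
    exposure_s: float     # exposure time in seconds

    @property
    def nyquist_A(self) -> float:
        # Finest resolvable spacing is two pixels.
        return 2.0 * self.pixel_size_A

    @property
    def dose_per_frame(self) -> float:
        return self.total_dose / self.frames

    @property
    def dose_rate(self) -> float:
        # Electrons per square angstrom per second.
        return self.total_dose / self.exposure_s


tdr = Acquisition(pixel_size_A=1.34, total_dose=52.0, frames=32, exposure_s=2.0)

print(f"Nyquist limit: {tdr.nyquist_A:.2f} A")
print(f"Dose per frame: {tdr.dose_per_frame:.3f} e/A^2")
print(f"Dose rate: {tdr.dose_rate:.1f} e/A^2/s")
```

The Nyquist limit for this pixel size is finer than the 3.4-3.5 Å maps reported, so the sampling does not cap the resolution obtained.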

Micrographs were first manually examined to remove images with a large 
amount of contamination, very low number of particles (<15), substantial 
uncorrected drift, large astigmatism, extreme defocus values (<1 µm or >6 µm) and/or 
abnormal Fourier patterns. We selected 23,945 micrographs that had good signal to 
at least 8 Å. For the first dataset, a small set of particles was manually picked from 
8× binned micrographs and subjected to reference-free 2D classification. A 
selection of 2D-class averages that represented the range of particle sizes and 
shapes present in the micrographs (not just TDR complexes) was chosen, centred using 
the Relion shift_com function, low-pass filtered to 50 Å and used as references 
for autopicking all binned micrographs using Relion v.1.4. For optimal particle 
picking of other datasets, we used 2D classes obtained from multiple sessions. We 
also used a value of 1.1 for the ‘maximum standard deviation of the background 
noise’ setting to ensure we picked all good TDR particles with high contrast. This 
value resulted in substantial levels of contaminants but these were removed by 
extensive manual screening, as indicated below. Initially, particles with high 
‘autopick figure of merit’ values were screened to remove false positives. Then, 
the particles were cleaned by several cycles of 2D classification. At each cycle, the 
only particles that were discarded were those that were obviously dynein tail or 
contamination. In addition, particles with high ‘log likelihood contribution’ values 
were manually screened for remaining false positives and particles containing 
impurities with a very strong signal. After several cycles of 2D classification, 
particles corresponding to free dynactin in its dominant view were discarded. To 
do this, we first subjected particles from these classes to an additional round of 2D 
classification with the ‘image alignment’ setting turned off, in order to recover any 
TDR particles. Some of the dynactin classes of other-than-dominant view could 
not be easily distinguished from the TDR complex. Therefore, particles from all 
other dynactin-like classes were combined with all TDR particles and subjected 
to 3D refinement with the previous TDB structure low-pass filtered to 60 Å as 
a reference. The output translational information from this 3D refinement was 
used to obtain more accurate coordinates of the particles in individual 
micrographs (script from R. Fernandez-Leiro). These coordinates were used to extract 
re-centred particles from 8× binned micrographs, which were manually cleaned as 
described above. After this cleaning, screened particles were subjected to another 
round of 3D refinement followed by 3D classification, to improve the separation 
of dynactin and TDR particles. In all the steps described above (picking, 2D 
classification, and 3D refinement and classification), the option to ‘ignore CTFs until 
first peak’ was turned on. Translational information from the 3D classification was 
used as above to extract re-centred TDR particles from unbinned micrographs. 
The 3D refinement using the unbinned particles from the first dataset yielded a 
6.5 Å resolution map, and the 3D refinement using all 11 datasets yielded a 3.5 Å 
resolution map. 
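
The quantitative part of the micrograph screening described at the start of this subsection (fewer than 15 particles, or defocus below 1 µm or above 6 µm) can be written as a simple filter. The sketch below is illustrative only: the screening in this study was done manually and also considered contamination, drift, astigmatism and Fourier patterns, which are not encoded here. The `Micrograph` record, its field names and the example entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Micrograph:
    """Hypothetical per-micrograph record; field names are illustrative."""
    name: str
    n_particles: int
    defocus_um: float


MIN_PARTICLES = 15               # images with fewer particles were removed
DEFOCUS_RANGE_UM = (1.0, 6.0)    # images outside this range were removed


def passes_screen(mic: Micrograph) -> bool:
    """Apply only the two numeric thresholds quoted in the text."""
    low, high = DEFOCUS_RANGE_UM
    return mic.n_particles >= MIN_PARTICLES and low <= mic.defocus_um <= high


# Made-up example entries, for demonstration only.
examples = [
    Micrograph("mic_001", n_particles=100, defocus_um=2.5),
    Micrograph("mic_002", n_particles=10, defocus_um=2.0),
    Micrograph("mic_003", n_particles=80, defocus_um=0.7),
    Micrograph("mic_004", n_particles=120, defocus_um=6.4),
]

kept = [mic.name for mic in examples if passes_screen(mic)]
print(kept)
```

Keeping the thresholds as named constants makes it easy to see which criteria were numeric and which depended on visual inspection.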

3D classification (see above) revealed movement of the dynein tails, which 
resulted in a lower resolution for these parts of the map. As a result, we conducted 
focused 3D classification and 3D refinement of the dynein tail as described, 
except that we used multiple rounds of mask optimization. First, we generated 
overlapping binary masks covering the N- and C-terminal densities of all four 
dynein chains and used the particle subtraction function in Relion to subtract the 
density outside these regions from the raw TDR particles. Next, 3D refinement 
was used to align these ‘subtracted particles’ on the basis of the remaining density. 
Then we used 3D classification with no alignments to investigate which parts of 
the structure moved as a rigid block. We then made new masks around the rigid 
block and repeated all the steps of particle subtraction, 3D refinement and 3D 
classification. This process was repeated several times to obtain the optimal mask. 

In the case of the N-terminal region of the tail, the optimal mask was used for 
a final round of particle subtraction and 3D refinement, which resulted in a 3.4 Å 
map. To further improve the densities for the IC WD40 domain, we performed 
local sub-volume averaging with Chimera 1.10. Similarly, for the C-terminal 
region of the dynein tail we performed a round of particle subtraction and 3D 
refinement using the optimal mask. We then used 3D classification with no 
alignment to identify the most homogeneous particles for different local regions. 
For HC residues 500-927, we performed a 3D classification using the whole of the 
optimal mask. For ROBL1 or LIC, we performed 3D classification using the local 
mask around their respective regions. In all cases, selected particles were refined 
using the whole optimal mask. All density maps were corrected for the modulation 
transfer function of the detector, and then sharpened by applying a negative B 
factor that was estimated either using automated procedures within Relion or by 
manually set parameters. 
Model building and refinement. Building was performed in COOT and refine- 
ment in PHENIX. The dynein HC residues 201-710 from dynein-B1 and the IC 
from dynein-A2 were de novo built and refined into the ‘N-terminal tail’ map 
guided by maps generated by local sub-volume averaging, with improved density 
for flexible loops. HC residues 500-927 from dynein-A2 were built and refined 
into the ‘C-terminal tail’ map. A ‘LIC-mask’ map was used to model secondary 
structure elements for HC residues 927-1,057 and to rebuild a Phyre2 homology 
model for human LIC2 (both dynein-A2). A ‘Robl-mask’ map was used to rebuild 
a homology model for the ROBL1-IC-extended-N termini complex and identify 
its interactions with the dynein-A2 IC WD40 domain. The separately built 
components were combined to generate a consensus model for dynein-A2. 

The structure of TDR was assembled and real-space refined into an overall 
TDR map that was not sharpened, and was filtered to 8 Å resolution. We used 
four copies of the HC-IC-LIC consensus model, two copies of the NDD, a model 
of dynactin and a stretch of coiled coil for BICDR1. The combined model was fit 
into the 3.5 Å overall map. Sections of dynactin, including the CAPZαβ dimer and 
the N terminus of subunit p50, were rebuilt. Corrections were made to secondary 
structure elements in the pointed end and shoulder domains. An approximate 
model for the BICDR1 coiled coil (residues 105-392) was generated by placing 
helices into density, and assigning registry on the basis of fitting residues Phe159 
and Trp166 into the bulky density in the core of the coiled coil. Regions of the TDR 
model in weak density were set to zero occupancy for refinement into the overall 
3.5 Å map. Segments of the final model were re-refined against the N-terminal and 
C-terminal maps (Extended Data Table 2). 

Cryo-EM of TDH. To assemble the TDH complex, dynein tail, dynactin and 
HOOK3(1-522)-SNAPf were mixed in a 2:1:20 molar ratio and incubated on ice for 
15 min. The sample was cross-linked to increase the amount of complex formed by 
the addition of 0.0125% (v/v) glutaraldehyde (Sigma-Aldrich) at room temperature 
for 15 min before quenching with 200 mM Tris pH 7.4 (final concentration). 
The sample was gel-filtered using a TSKgel G4000SWXL (TOSOH Bioscience) 
equilibrated in 25 mM Hepes-KOH pH 7.2, 150 mM KCl, 1 mM MgCl2, 0.1 mM 
Mg.ATP, 5 mM DTT. The TDH complex was concentrated in a 100-kDa cut-off 
Amicon at 1,500 r.c.f. to 0.1-0.2 mg ml⁻¹. Three microlitres of the TDH sample 
were applied to freshly glow-discharged Quantifoil R2/2 300-square-mesh copper 
grids covered with a thin carbon support. Samples were incubated on grids in an 
FEI Vitrobot IV for 45 s and blotted for 3-4.5 s at 100% humidity and 4 °C, and 
then plunged into liquid ethane. 

Micrographs were recorded using automated data collection (FEI EPU) on an 
FEI Titan Krios with an FEI Falcon III detector in linear mode: 300 kV; 59 frames 
during a 1.5-s exposure; 1.42 Å per pixel; 45 e⁻ per Å²; and 5 images per hole. 
Correction of inter-frame movement for each pixel and dose-weighting was 
performed using MotionCor2. CTF parameters were estimated using GCTF. 
A reference set of 2D classes was generated using Relion v.2.0 from a small set of 
particles picked by the EMAN2 Swarm boxing tool. Gautomatch (http://www. 
mrc-lmb.cam.ac.uk/kzhang/) was used to pick particles from all micrographs 
(4× binned) using this reference set. Relion v.2.0 was used for 2D classification 
to clean the autopicked particles. An 8.2 Å resolution TDB structure, low-pass 
filtered to 60 Å, was used as an initial model for a first round of 3D refinement. The 
dataset was cleaned by 3D classification using the output from 3D refinement as 
a reference. The cleaned particles were re-extracted from unbinned micrographs 
and used for a final round of 3D refinement, yielding a 6.7 Å map. 
Negative stain electron microscopy analysis. The dynein tail-dynactin-adaptor 
triple complexes were assembled by mixing 1.711 of | mg ml! dynein tail (HC"™°), 
1 il of 1.35mg ml! dynactin and 21] of 1.3mg ml cargo adaptor (BICD2!4°— 
GFP, HOOK3!*2_-SNAPf and BICDRI-GEP). The samples were incubated on 
ice for 15 min before diluting eightfold for preparation of negative stain grids as 
described®. Two replicate samples were made on separate days. Between 200 and 
400 micrographs of each sample were recorded using FEI EPU on a FEI Falcon II 
direct electron detector, fitted to a FEI F20 electron microscope operated at 200kV: 
1-s exposure; 2.08 A per pixel; 20 e~! per A?; 0.8-1.2\1m underfocus. A small set 
of particles was picked from 4x binned micrographs using the EMAN2 Swarm 
boxing tool. Subsequent processing was done by Relion v.2.0 unless otherwise 
stated. Particles were extracted and subjected to reference-free 2D classification. 
From two to five 2D-class averages of triple complex were centred using a 
shift_com command, low-pass filtered to 50 Å resolution and used as references
for automated particle picking within Relion v.1.4 of all 4x binned micrographs. 
Autopicked particles were extracted, split in half and subjected to 2D classification 
as above. Ten 2D-class averages representing different particle orientations were 
selected and used for another round of autopicking. This third round of auto- 
picking was used to obtain the optimal particle selection (fewest missed particles) 
and best centring. 

The resulting particles were subjected to three rounds of 2D classification to 
identify good complex particles (particle numbers for TDB: 6,382, 7,648, 5,534, 
and 6,163; TDH: 3,430 and 3,958; TDR: 1,713, 1,861, 5,782 and 5,388). For the best 
3D classification, these sets of particles were first subjected to 3D refinement using 
the cryo-EM structure of dynactin (EMDB code: EMD-2856), low-pass filtered to 
60 Å, as a reference. 3D classification was carried out using the map from the 3D
refinement, as well as the orientation and rotational assignment for the particles. 
3D classification was carried out by setting the regularization parameter T to 2 
and gradually adjusting image alignment sampling: 10 iterations with 15° angular 
sampling interval, offset search range set to 8 pixels and step to 2 pixels; 25 
iterations with an angular sampling interval of 7.5°, offset search range set to 
5 pixels and step to 1 pixel. The gradual reduction in sampling yielded the best 
classification. It was followed by 15 iterations with an angular sampling interval 
of 3.7°, offset search range set to 3 pixels and step to 0.5 pixels, which yielded 3D 
classes with density maps of sufficient quality to identify the number of dynein 
tails bound to dynactin. The fraction of complexes containing one or two dyneins 
was calculated from the number of particles assigned to each class. 


To determine the path of BICD2, all particles of the complexes from both TDB
(BICD2¹⁻⁴⁰⁰–GFP) datasets were combined (42,823 particles). Binned particles
from these complexes were subjected to 3D refinement as described above. These 
coordinate files were used to re-centre—as described above—and extract particles 
from unbinned micrographs, with CTF parameters that were determined using 
CTFFIND3_130307. Extracted particles were subjected to 3D refinement, followed 
by 3D classification. In both steps, the CTF correction was set to ignore CTFs until 
the first peak. Particles from 3D classes with dynein-A only, or with dynein-A 
and dynein-B, were separated and processed separately. Each set of particles was 
subjected to another round of 3D refinement and subsequent 3D classification 
(using 25 iterations: 7.5° angular sampling interval, offset search range of 5 pixels 
and step of 1 pixel) to obtain the best particles for each complex (13,278 and 14,070 
particles for TDB with dynein-A only and with two dyneins, respectively). These 
were used in a final round of 3D refinement. Molecular models of dynactin, cargo 
adaptors and dynein tails were fitted into the density maps and used to colour 
different segments in Chimera. 
Single molecule assays. SNAPf—dynein or SNAPf-tail complexes were labelled 
with TMR-Star or Alexa Fluor 647 (NEB) and purified separately, and the 
percentage labelling was quantified as previously described®. Single-molecule 
assays were conducted as previously described®, with slight modifications in order 
to optimize the number of moving complexes. Under these optimized conditions, 
DDB complexes moved faster than observed in previous publications>®!°"!, All 
adaptor complexes were measured under identical conditions. The percentage of 
processive events was 56% for DDB, 76% for DDH and 75% for DDR. 

Microtubules were made by mixing 26 µl of 5.2 mg ml⁻¹ unlabelled pig tubulin,
5 µl of 2 mg ml⁻¹ HiLyte Fluor 488 tubulin and 10 µl of 2 mg ml⁻¹ biotin tubulin
(Cytoskeleton) in BRB80 buffer (80 mM PIPES pH 6.8, 1 mM MgCl₂, 1 mM
EGTA, 1 mM DTT). The mixture was incubated on ice for 5 min before adding
41 µl of polymerization buffer (2× BRB80 buffer plus 20% (v/v) DMSO and 2 mM
Mg.GTP). Microtubules were polymerized at 37 °C for 30–60 min. The sample was
diluted with 200 µl of microtubule buffer (BRB80 plus 40 µM paclitaxel). Excess
tubulin was removed by pelleting (20,238 r.c.f., 8.5 min, at room temperature). The
microtubule pellet was washed with 200 µl of microtubule buffer and re-pelleted
as above. The microtubule pellet was re-suspended in 200 µl of microtubule buffer
and stored at room temperature, protected from light, for at least half a day (to a
maximum of three days) before use. Microtubules were visualized (see below) and,
if the density was too low or the free tubulin concentration was too high, they were
re-pelleted and re-suspended in a smaller volume of microtubule buffer.

Motility chambers were made by applying two strips of double-sided tape (Tesa)
approximately 8–10 mm apart on a glass slide and placing a cleaned coverslip® on
top. The glass surface was functionalized with 0.4 mg ml⁻¹ biotinylated
poly(L-lysine)-g-poly(ethylene-glycol) (SuSoS AG). The chamber was immediately
washed with 40 µl of assay buffer (30 mM Hepes-KOH pH 7.2, 5 mM MgSO₄,
1 mM EGTA, 1 mM DTT). Subsequently, 10 µl of 1 mg ml⁻¹ streptavidin (NEB) was
flowed through and immediately washed with 40 µl of assay buffer. The chamber
was then incubated with freshly diluted microtubules (typically 3 µl of microtubules
and 10 µl of assay buffer). Microtubules were immediately washed out with
assay buffer, followed by assay buffer supplemented with 1.25 mg ml⁻¹ α-casein
(Sigma-Aldrich).

Dynein-dynactin-cargo-adaptor complexes were prepared by mixing 1 µl of
1 mg ml⁻¹ fluorescently labelled dynein, 1 µl of 1.35 mg ml⁻¹ dynactin and 2 µl of
1.3 mg ml⁻¹ cargo adaptor (SNAPf–BICD2¹⁻⁴⁰⁰, HOOK3¹⁻⁵²²–SNAPf and
SNAPf–BICDR1). For experiments with SNAPf-tail, an additional 1 µl of 1 mg ml⁻¹
TMR-labelled tail was added. The complex was incubated on ice for 10–15 min, and
then diluted with assay buffer to a final volume of 10 µl. One microlitre of complex
solution was added to 19 µl of assay buffer containing 1.25 mg ml⁻¹ α-casein, an
oxygen scavenging system (0.2 mg ml⁻¹ catalase (Calbiochem) and 1.5 mg ml⁻¹
glucose oxidase (Sigma-Aldrich)), 0.45% (w/v) glucose, 1% BME, 25 mM KCl
and 5 mM Mg.ATP. Taxol was omitted from this buffer to reduce the non-specific
background and non-moving events. The motility mix was flowed into the chamber,
and washed with assay buffer supplemented with 1.25 mg ml⁻¹ α-casein for a
second time. The sample was analysed immediately at 23 ± 1 °C with a total internal
reflection fluorescence microscope, as previously described®. Colocalization and
velocity were determined from the same datasets (4, 8, 9 and 7 chambers overall, for
dynein-only, DDB, DDH and DDR, respectively). Data were collected on two different
days. Tail and full-length dynein speed data were collected from three chambers.

Data were analysed using ImageJ*!. Tif stacks were Z-projected to identify
the paths of microtubules, segmented lines were drawn along them and kymographs
were generated using the reslice function. Processive movements were defined as
previously described’. Velocity was calculated using a pixel size of 105 nm and a
frame rate of 236 ms per frame.
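
As a minimal sketch of the velocity conversion just described, the following Python helper turns the extent of a run on a kymograph into a mean velocity. The pixel size and frame interval come from the text. The function name and its inputs (`displacement_px`, `duration_frames`) are hypothetical and are not taken from the authors' code.

```python
# Constants taken from the text above.
PIXEL_SIZE_NM = 105.0        # kymograph pixel size along the microtubule
FRAME_INTERVAL_S = 0.236     # 236 ms per frame


def run_velocity_um_per_s(displacement_px: float, duration_frames: float) -> float:
    """Mean velocity of one processive run read off a kymograph.

    displacement_px and duration_frames are hypothetical inputs: the extent
    of a run trace along the distance axis (pixels) and the time axis (frames).
    """
    if duration_frames <= 0:
        raise ValueError("duration_frames must be positive")
    distance_um = displacement_px * PIXEL_SIZE_NM / 1000.0  # nm -> micrometres
    duration_s = duration_frames * FRAME_INTERVAL_S
    return distance_um / duration_s


if __name__ == "__main__":
    # Illustrative numbers only: a run spanning 100 pixels over 50 frames.
    print(f"{run_velocity_um_per_s(100, 50):.3f} um/s")
```

The same conversion applies to each segment of a run if instantaneous rather than mean velocities are wanted.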

Colocalization data were collected using a DV2 beam splitter (Photometrics), 
which projected each channel (TMR and Alexa Fluor 647) onto a different half 
of the image. Movie tif-stacks were split and kymographs generated for each
channel. Kymographs were overlaid using the ‘Colour:Merge’ function to generate a 
composite image. The kymographs were manually scored for processive events that 
showed colocalization, followed by those events that only appeared in individual 
channels. Colocalization in the dynein-alone control chambers was determined 
for all microtubule binding events longer than 2 frames. 

The fraction of complexes containing two dyneins, d, was calculated from the
fraction of total events with signal for TMR-only (R_obs), Alexa Fluor 647-only
(G_obs) and colocalized signal (Y_obs). Colocalized events represent
dynein-dynactin-adaptor complexes with two dyneins, and single-coloured events
represent a mixture of both one- and two-dynein complexes. We can express this
using the following three equations:

$$R_{\mathrm{obs}} = (s \times r) + (d \times r^{2})$$

$$G_{\mathrm{obs}} = (s \times g) + (d \times g^{2})$$

$$Y_{\mathrm{obs}} = d \times 2(r \times g)$$

where s is the fraction of complexes that contain one dynein, r is the fraction
of TMR-labelled dynein and g is the fraction of Alexa Fluor 647-labelled dynein
flowed into the chamber. These equations hold at high labelling efficiency (our
dynein was >94% labelled for a dynein monomer, or >99.64% per dimer), where
dynein can be labelled by either TMR or Alexa Fluor 647 (r + g = 1). We can
therefore solve these equations for d without knowing r or g.
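
One way to carry out this solution is sketched below. It assumes, in addition to r + g = 1, that every scored complex contains either one or two dyneins (s + d = 1). Under those assumptions the TMR-only fraction is s·r + d·r² = r − d·r·g, which equals r minus half the colocalized fraction; the same holds for g. The labelled fractions are therefore the single-colour fraction plus half the colocalized fraction, and d follows from the colocalized fraction divided by 2rg. The function names are hypothetical, and this is an illustration rather than the authors' code.

```python
def two_dynein_fraction(r_obs: float, g_obs: float, y_obs: float) -> float:
    """Fraction d of complexes with two dyneins, from event fractions or counts.

    Assumes r + g = 1 (stated in the text) and s + d = 1 (assumption made here).
    """
    total = r_obs + g_obs + y_obs
    if total <= 0:
        raise ValueError("at least one event is required")
    # Normalise so that counts and fractions are both accepted.
    big_r, big_g, big_y = r_obs / total, g_obs / total, y_obs / total
    r = big_r + big_y / 2.0   # inferred TMR-labelled fraction
    g = big_g + big_y / 2.0   # inferred Alexa Fluor 647-labelled fraction
    if r <= 0 or g <= 0:
        raise ValueError("both labels must be represented to solve for d")
    return big_y / (2.0 * r * g)


def dimer_labelled_fraction(monomer_labelled: float) -> float:
    """Probability that at least one of the two monomers in a dimer is labelled."""
    return 1.0 - (1.0 - monomer_labelled) ** 2


if __name__ == "__main__":
    # Synthetic check: r = g = 0.5, d = 0.4 gives R = G = 0.4 and Y = 0.2.
    print(two_dynein_fraction(0.4, 0.4, 0.2))
    # 94% monomer labelling corresponds to 99.64% per dimer, as quoted above.
    print(dimer_labelled_fraction(0.94))
```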
Stall force measurements. Eight-hundred-nanometre carboxy latex beads (Life
Technologies) were functionalized with anti-GFP antibodies”’. Dynein, dynactin
and cargo adaptor (BICD2¹⁻⁴⁰⁰–GFP, HOOK3¹⁻⁵²²–GFP and BICDR1–GFP) were
mixed at a 1:5:20 molar ratio and incubated for 10 min on ice in dynein motility
buffer (DMB: 30 mM HEPES pH 7.0, 5 mM MgSO₄, 1 mM EGTA) supplemented
with 1 mM tris(2-carboxyethyl)phosphine (TCEP) and 1 mg ml⁻¹ BSA. The diluted
mixture was incubated with the anti-GFP beads for 10 min on ice. Cy5-labelled
axonemes”! were introduced to the sample flow chamber, which was washed
with 40 µl of DMB plus 1 mM TCEP and 500 µg ml⁻¹ casein. The protein-bead
mixture was introduced to the chamber in imaging buffer (DMB with 1 mM TCEP,
500 µg ml⁻¹ casein, 2.5 mM protocatechuic acid, 35 µg ml⁻¹
protocatechuate-3,4-dioxygenase (PCD), 2 mM Mg.ATP). Kinesin was diluted in BRB80
supplemented with 1.5 mg ml⁻¹ casein and 2 mM DTT and mixed with anti-GFP beads
for 10 min on ice. Kinesin-coated beads were introduced to the sample chamber in
motility imaging solution (BRB80 supplemented with 2 mM DTT and 1.5 mg ml⁻¹ casein,
2.5 mM protocatechuic acid, 35 µg ml⁻¹ PCD, 2 mM Mg.ATP). For each experiment,
the protein concentration was adjusted until less than 30% of the tested beads
exhibited any binding and motility when brought in contact with axonemes. This
ensured that 95% of the beads were driven by a single processive motor complex.
Trapping experiments were performed on a custom-built fully automated 
optical trap microscope setup”!. To generate stall force histograms, position data 
from trap recordings (5 kHz) were down-sampled to 500 Hz and then manually 
screened and selected. To qualify as a stall event, the position trace had to reach 
a plateau and remain stationary for at least 100 ms before full release from the 
microtubule (defined as a rapid greater-than-2 ms jump of at least 50 nm). For 
DDB, we measured n = 54 stalls from 14 beads in 4 independent experiments; for 
DDH, n = 118 stalls from 25 beads in 6 independent experiments; for DDR, n = 53
stalls from 17 beads in 5 independent experiments; and for kinesin, n = 83 stalls 
from 18 beads in 4 independent experiments. 
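
The down-sampling and plateau criterion above can be illustrated with a short Python sketch. The stall events in this work were screened and selected manually, so the automated check below is only a hypothetical aid. Block averaging is assumed as the down-sampling method, and the `tolerance_nm` parameter is an invented threshold for what counts as stationary; neither is specified in the text.

```python
def downsample_mean(samples, factor):
    """Down-sample by averaging consecutive blocks (e.g. factor 10: 5 kHz -> 500 Hz)."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    n_blocks = len(samples) // factor  # a trailing partial block is dropped
    return [sum(samples[i * factor:(i + 1) * factor]) / factor
            for i in range(n_blocks)]


def longest_plateau_ms(positions_nm, rate_hz, tolerance_nm):
    """Longest stretch (ms) in which the bead stays within tolerance_nm
    of the position at the start of that stretch."""
    best = 0
    start = 0
    for i, p in enumerate(positions_nm):
        if abs(p - positions_nm[start]) > tolerance_nm:
            start = i  # the bead moved away: begin a new candidate plateau
        best = max(best, i - start + 1)
    return best * 1000.0 / rate_hz


if __name__ == "__main__":
    # Synthetic 500 Hz trace: a short rise, 60 stationary samples, then release.
    trace = [0.0, 10.0, 20.0] + [30.0] * 60 + [0.0]
    plateau = longest_plateau_ms(trace, rate_hz=500, tolerance_nm=2.0)
    print(plateau, plateau >= 100.0)  # compare with the 100 ms criterion
```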
Statistical analysis. Statistical analysis was performed using GraphPad Prism 7 
(GraphPad). The statistical significance of difference in mean values was calculated
using one-way ANOVA with Tukey’s multiple comparisons test, or an unpaired
t-test, as indicated for each experiment. No statistical methods were used to 
predetermine sample size. Velocity and colocalization data were randomized before 
analysis; other experiments were not randomized, and the investigators were not 
blinded to allocation during experiments and outcome assessment. 

Code availability. Code used is available from A.P.C. 

Data availability. Cryo-EM maps have been deposited in the EMDB under 
accession codes EMD-4168 (whole TDR complex), EMD-4169 (N-terminal tail), 
EMD-4170 (C-terminal tail/HC C terminus), EMD-4171 (LIC region), EMD-4172 
(ROBL1 region) and EMD-4177 (TDH complex). Coordinates are available from
the RCSB Protein Data Bank under accession codes 6F1T (whole TDR complex),
6F1U (N-terminal tail), 6F1V (C-terminal tail), 6F1Y (LIC region), 6F1Z (ROBL1
region), 6F38 (TDH complex) and 6F3A (TDB complex). Raw data are available
from A.P.C.
Extended Data Figure 1 | Single particle cryo-EM analysis of TDR
and TDH. a, Cryo-EM reconstruction of the TDR complex analysed by
ResMap™ and showing resolution distribution from 4 to 12 Å.
b, The gold-standard Fourier shell correlation (FSC) curve of the 6.5 Å
TDR map. c, Cryo-EM reconstruction of the TDH complex, showing
resolution distribution from 4 to 12 Å. d, The gold-standard FSC curve
of the 6.7 Å TDH map. e, Cryo-EM density for TDR low-pass filtered to
6.7 Å resolution (coloured according to cartoon) and to 20 Å (transparent
outline). Density at the N terminus of BICDR1 is boxed. f, Cryo-EM
density for TDH low-pass filtered to 6.7 Å (coloured according to cartoon)
and to 20 Å (transparent outline) showing the putative Hook domain, an
extension of the HOOK3 coiled coil ending in extra density near dynein-B
(dashed box).
Extended Data Figure 2 | Single-molecule assay speed distributions.
a, A one-cumulative frequency distribution plot showing run-lengths
of DDB, DDR and DDH, with fit to a one-phase exponential decay. The
decay constant (run length) and R² value (least squares regression) of
the fit are shown. We measured 785, 677 and 684 events for DDB, DDH
and DDR, respectively, from microtubules of at least 20 µm in length
from three chambers. b-f, Distribution of mean velocities of processive
(unidirectional, minus-end-directed) events for DDB (n = 3,343) and
DDR (n = 3,162) (b); DDB and DDH (n = 3,744) (c); active mutant dynein
in complex with dynactin and BICD2 (mtDDB, n = 905) or BICDR1
(mtDDR, n = 1,183) (d); the colocalized mtDDR complexes containing
both TMR-dynein tail and Alexa Fluor 647-full-length dynein
(tail-dynein, n = 939) or Alexa Fluor 647-only complexes containing
only full-length dynein (dynein-only, n = 1,004) (e); and all DDB
complexes and complexes with both fluorophores, and hence two dyneins
(colocalizers, n = 660) (f). Mean ± s.e.m. values were estimated by fitting
the histograms to a Gaussian distribution (dashed lines).
Extended Data Figure 3 | Single-particle cryo-EM analysis of TDR
complex at 3.5 Å resolution. a, Micrograph of the TDR complex
(representative of 26,906 micrographs). b, Typical 2D-class averages
of TDR in different orientations. c, The overall density map of TDR
was analysed by ResMap, showing a resolution distribution from
3 to 8 Å. d, The gold-standard FSC curve of the overall TDR map. e, Mesh
representation of 3.4 Å resolution density map of α helices from dynein-B1
obtained by focused 3D classification and refinement. f, Sample density
obtained by local sub-volume averaging, showing β strands from IC
WD40.
Extended Data Figure 5 | Secondary structure diagram of dynein HC.
a, Secondary structure elements of dynein HC are matched against the
primary sequence showing the NDD (purple) and the dynein helical
bundles (blue; cyan; green; yellow; pale yellow; orange; red; pink).
b, Secondary structure elements of IC. Extended N-terminal regions are
coloured purple and other elements are coloured according to the blade of
the WD40 domain to which they belong, except sheet 35, which associates
with β30–32. c, Secondary structure elements of LIC, showing the globular
domain helices and sheets (blue) and the two helices that pack against the
HC (red). Jpred** secondary structure predictions of features not seen in
the electron microscopy map are shown in grey.
Extended Data Figure 6 | Interactions between dynein subunits. a, The
dynein HC (yellow) interacts with the IC WD40 domain (blue) using
bundles 4 and 5, with a helical segment (red cartoon) sitting in the WD40
central cavity. Dynein-A2 is shown. Interacting residues are shown as
sticks (bottom panel), with HC residues in red and IC residues in green.
b, Density map and model showing how the LIC (density and cartoon,
blue) N- and C-terminal regions extend from the globular domain and
pack against the HC (density, coloured by bundle number). Dynein-A2 is
shown. c, ROBL1 (cartoon, light and dark pink) makes contacts with the
IC N-terminal helices (cartoon, light and dark blue), which mediate the
interaction between ROBL1 and the IC WD40 (surface). d, Representative
density from the 1.9 Å resolution NDD crystal structure. e, Cartoon model
of the NDD showing one chain in rainbow spectrum.
[Sequence alignment panel: dynein HC (positions 625, 675, 690, 725, 890 and 1020) and IC (positions 280, 316, 366 and 385) residues at the Dynein-A2 and Dynein-B1 interaction sites, aligned across H. sapiens, M. musculus, X. tropicalis, D. rerio, D. melanogaster, C. elegans, U. maydis, A. nidulans and S. cerevisiae.]
Extended Data Figure 7 | Dynein-dynein contacts and interactions at
the BICDR1 N terminus. a, Conservation diagram showing sequence
similarity between A2 and B1 interacting residues. Residues coloured
white with red background are completely conserved, whereas residues
coloured red show sequence similarity at that position. Residues at each
interaction site are numbered below the alignment (A2 residues in yellow
circles, B1 residues in red circles). These numbers label the accompanying
cartoon to show the dynein chains that constitute each interaction.
Alignment generated by ESPript™ (http://espript.ibcp.fr). b, Intermediate
chain interactions showing connections between the IC of A1 and the HC
of A2; the IC of A2 and the HC of B1; and the IC of B1 and the HC of B2.
Interacting sites on each IC are shown as yellow spheres; sites on each HC
are shown as red spheres. c, B1 (pink) contacts extra density (labelled,
blue) adjacent to the BICDR1 coiled coil. The cartoon below shows the
location of the area depicted (correspondingly coloured). d, Weak density
connects the extra density with the LIC A2 helix 13 (blue). A cartoon
representation of the area depicted is shown below.
Extended Data Figure 8 | Comparison between different adaptors
recruiting dynein. a, The TDR structure (left) is compared to models of
TDH (middle) and TDB (right). Although the paths of BICDR1 (yellow),
HOOK3 (magenta) and BICD2 (orange) vary along the surface of dynactin
(green surface), dynein-A HCs (light blue) bind at the same sites in each
complex. b, Zoomed-in views of the barbed end of dynactin show that
BICD2 adopts an upwards position to contact ARP1A (grey), whereas
BICDR1 and HOOK3 adopt lower positions to bind dynein-B using the
region coloured in red. The BICD2-ARP1A interaction site is highlighted
in purple.
Extended Data Table 1 | Cryo-EM data collection parameters of TDR and TDH structures and model refinement statistics of the 3.5 Å resolution TDR structure

Data collection and processing (row labels; no values survive in this copy):
Voltage (kV); Electron exposure (e⁻/Å²); Pixel size (Å); Number of sessions; Micrographs; Symmetry imposed; Final particle images (no.); Map resolution (Å); FSC threshold

Refinement                      N-terminal tail   C-terminal tail   TDR
Map                             EMD-4169          EMD-4170          EMD-4168
Map resolution (Å)              3.4               3.4               8
FSC threshold                   0.143             0.143             0.143
Map sharpening B factor (Å²)    -50               -30               0
Map CC (around atoms)           0.76              0.70              0.79
Model composition
  Non-hydrogen atoms            28,871            5,164             92,789
  Protein residues              3,555             628               13,982
  Ligands (ADP/ATP)             0/0               0/0               9/1
R.m.s. deviations
  Bond lengths (Å)              0.02              0.01              0.01
  Bond angles (°)               1.59              1.93              1.60
Validation
  MolProbity score              2.12              2.21              2.05
  Clashscore                    11.18             13.19             9.08
  Poor rotamers (%)             0.29              0.36              0.44
Ramachandran plot
  Favored (%)                   89.83             88.75             89.13
  Disallowed (%)                0.11              0.16              0.14
  Cβ deviations (%)             0.00              0.00              0.00

TDR(ordered): EMD-4168; 0.00

The TDR_1 dataset is included in TDR_2 dataset. The N-terminal tail model consists of HC of A2 (residues 201-829), HC of B1 (residues 201-629), HC of B2 (residues 201-575), IC of A2 WD40
domain, BICDR1 (residues 132-210), ARP1B, ARP1D, ARP1F, CAPZα and CAPZβ. The C-terminal tail model consists of HC of A2 (residues 517-927) and HC of B1 (residues 453-702). The TDR
(ordered) model consists of all parts of TDR for which density was seen.
Extended Data Table 2 | Crystal structure data collection parameters and model refinement statistics of the 1.9 Å resolution structure of the human dynein NDD

                                  NDD (PDB 5OWO)
Data collection
Space group                       eee
Cell dimensions
  a, b, c (Å)                     50.5, 101.8, 176.19
  α, β, γ (°)                     90, 90, 90
Resolution (Å)                    50.45–1.86 (1.94–1.86)*
Rmerge                            0.291 (3.939)
I/σI                              3.9 (0.4)
Completeness (%)                  94.3 (98.5)
Redundancy                        3.1 (2.9)

Refinement
Resolution (Å)                    1.80
No. reflections                   48901
Rwork / Rfree                     27.09/29.24
B-factor, from Wilson plot (Å²)   25.7
R.m.s. deviations
  Bond lengths (Å)                0.02
  Bond angles (°)                 2.15

*Values in parentheses are for the highest-resolution shell.

© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.
A mildly relativistic wide-angle outflow in the
neutron-star merger event GW170817 
GW170817 was the first gravitational wave detection of a binary
neutron-star merger’. It was accompanied by radiation across the
electromagnetic spectrum and localized’ to the galaxy NGC 4993
at a distance of 40 megaparsecs. It has been proposed that the
observed γ-ray, X-ray and radio emission is due to an ultra-
relativistic jet launched during the merger, directed away from
our line of sight®~°. The presence of such a jet is predicted from
models that posit neutron-star mergers as the central engines
that drive short hard γ-ray bursts”*. Here we report that the radio
light curve of GW170817 has no direct signature of an off-axis
jet afterglow. Although we cannot rule out the existence of a jet
pointing elsewhere, the observed γ-rays could not have originated
from such a jet. Instead, the radio data require a mildly relativistic
wide-angle outflow moving towards us. This outflow could be the
high-velocity tail of the neutron-rich material dynamically ejected
during the merger or a cocoon of material that breaks out when a
jet transfers its energy to the dynamical ejecta. The cocoon model
explains the radio light curve of GW170817 as well as the γ-rays
and X-rays (possibly also ultraviolet and optical emission)”-'5, and
is therefore the model most consistent with the observational data.
Cocoons may be a ubiquitous phenomenon produced in neutron-
star mergers, giving rise to a heretofore unidentified population of
radio, ultraviolet, X-ray and γ-ray transients in the local Universe.

The radio discovery” of GW170817, as well as observations within 
the first month post-merger, were interpreted in the framework of 
classical off-axis jet, cocoon, and dynamical ejecta. We continued to 
observe GW170817 with the Karl G. Jansky Very Large Array (VLA), 
the Australia Telescope Compact Array (ATCA) and the upgraded 
Giant Metrewave Radio Telescope (uGMRT), spanning the frequency 
range 0.6-18 GHz, whilst optical and X-ray telescopes were constrained 
by proximity to the Sun. Our radio detections span up to 107 days 
post-merger (Figure 1 and Methods). These data show a steady rise in 
the radio light curve and a spectrum consistent with optically-thin syn- 
chrotron emission. A joint temporal and spectral power-law fit to these 
data of the form S ∝ ν^α t^δ is well described by a spectral index α = −0.6
and a temporal index δ = +0.8 (see Methods). On 2017 November 18
(93 days post-merger) the peak luminosity at 1.6 GHz was 2 × 10²⁶
erg s⁻¹ Hz⁻¹, a luminosity undetectable for even the nearest short-hard
γ-ray burst (SGRB) afterglow discovered to date!®.
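
To make the joint power-law form concrete, the sketch below scales a flux density from one frequency and epoch to another using the fitted indices quoted above (spectral index −0.6, temporal index +0.8). The reference flux, reference frequency and reference epoch are hypothetical placeholders, not measurements from this work.

```python
ALPHA = -0.6   # spectral index quoted in the text
DELTA = 0.8    # temporal index quoted in the text


def scaled_flux(s_ref, nu_ghz, t_days,
                nu_ref_ghz=3.0, t_ref_days=16.0,
                alpha=ALPHA, delta=DELTA):
    """Flux density at (nu_ghz, t_days) given a reference value s_ref at
    (nu_ref_ghz, t_ref_days), assuming S proportional to nu**alpha * t**delta.

    The reference frequency and epoch defaults are illustrative assumptions.
    """
    if min(nu_ghz, t_days, nu_ref_ghz, t_ref_days) <= 0:
        raise ValueError("frequencies and times must be positive")
    return s_ref * (nu_ghz / nu_ref_ghz) ** alpha * (t_days / t_ref_days) ** delta


if __name__ == "__main__":
    # Relative brightening at fixed frequency between day 16 and day 107.
    print(scaled_flux(1.0, 3.0, 107.0))
    # Relative flux at 6 GHz compared with 3 GHz at the same epoch.
    print(scaled_flux(1.0, 6.0, 16.0))
```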

The (sub-luminous) gamma-ray emission detected immediately 
after the gravitational wave detection'” must have been emitted by a 


relativistic outflow’, but an on-axis jet (scenario A in Figure 2) was 
ruled out by the late turn-on of the X-ray and radio emission? °!!"!3, 
If GW170817 produced a regular (luminous) SGRB pointing away 
from us, then the interaction of the jet with the circum-merger 
medium would have decelerated the jet, and the afterglow emission 
would have eventually entered into our line of sight, thus producing 
a so-called off-axis afterglow'*"°. For this geometry, the light curve 
rises sharply and peaks when the jet Lorentz factor γ ~ 1/(θ_obs − θ_j),
and then undergoes a power law decline (θ_obs is the angle between
the jet axis and the line of sight, and θ_j is the jet opening angle).
This behavior is clearly inconsistent with the full light curve shown 
in Figure 1. The rise is less steep than an off-axis jet and it is con- 
sistent with a monotonic increase without either a plateau or a sub- 
sequent decay. Initial off-axis models (based on available X-ray and 
radio data at the time) predicted a radio flux density>>? of ~10 µJy
(between 3 GHz and 10 GHz) ~100 days post-merger, while our 
measured values are at least a factor of five larger. The discrepancy 
with the off-axis jet model is further demonstrated in Figure 3 
where various jet and medium parameters are considered, showing in 
all cases a similar general light curve shape which cannot fit the data. 
We have considered a wide range in the phase space of off-axis models, 
and can rule out an off-axis jet (scenario B in Figure 2) as the origin of 
the radio afterglow of GW170817. We show below that even if we consider
a “structured jet”, in which the outflow has an angular dependence
of the Lorentz factor and energy (scenario E in Figure 2 represents one 
such configuration), the observed radiation arises predominantly from 
a mildly relativistic outflow moving towards us (at an angle less than 
1/γ), and we do not detect the observational signature of a relativistic
core within the structured jet. 

With a highly collimated off-axis jet ruled out, we next consider 
spherical or quasi-spherical ejecta components. A single spherical 
shell of expanding ejecta will produce a light curve that rises as 
S ∝ t³. The light curve of GW170817 immediately rules out such a
simple single-velocity ejecta model. The gradual but monotonic rise
seen in our radio data (S ∝ t^0.8; Figure 1) points instead to on-axis emission
from a mildly relativistic blast wave where the energy is increasing
with time (due to more mass residing in slower ejecta, which is seen at 
later times). For example, using canonical microphysical parameters
(ε_B = 0.01, ε_e = 0.1), a density of 10⁻⁴ cm⁻³ implies that between day
16 and day 107 the blast wave decelerates from γ ~ 3.5 to γ ~ 2.5 and
its isotropic equivalent energy increases from ~10⁴⁹ erg to ~10⁵⁰ erg.
On the other hand, a density of 0.01 cm⁻³ implies a velocity range of
0.8c to 0.65c and energy that rises from 10⁴⁸ erg to 10⁴⁹ erg. Figure 4
shows that a quasi-spherical outflow with a velocity profile E(>βγ)
∝ (βγ)⁻⁵ provides an excellent fit to the data (see Methods), and it is
almost independent of the assumed circum-merger density and micro- 
physical parameters. The energy injection into the blast wave during the 
time span of the observations (day 16 to day 107) increases its energy 
by a factor of ~10. The possible origin of the outflow depends on its 
energy and velocity. A faster and more energetic outflow, with γ ~ 2–3
and energy of 10⁴⁹–10⁵⁰ erg, is a natural outcome of the cocoon driven
by a wide-angle choked jet®!!"4 (scenario C in Figure 2). This scenario 
explains many of the puzzling characteristics of GW170817. First, the 
breakout of the cocoon from the ejecta can produce the observed 
sub-luminous gamma-ray signal, including its peak energy and 
spectral evolution! (see also Methods). Second, it provides a natural 
explanation for the high velocities of the bulk of the ejecta (~0.3c) 
and for the early bright UV and optical light'?'*"°. On the other hand, 
a slower and less energetic outflow, with β ~ 0.8–0.6 (γ ~ 1.67–1.25)
and energy of 10⁴⁸–10⁴⁹ erg can arise from the fast tail of the merger
ejecta*””’ (scenario D in Figure 2), although we note that this com- 
ponent cannot explain the gamma-ray signal (GRB 170817A) from 
GW170817. These two scenarios can be easily distinguished by Very
Long Baseline Interferometry or monitoring of the radio evolution on
timescales of ~years.
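
The velocity and Lorentz-factor pairs quoted for the two scenarios are related by the standard special-relativistic definition, Lorentz factor = 1/sqrt(1 − β²). The short sketch below (function names are illustrative) converts in both directions, so the quoted correspondence between β of 0.8 to 0.6 and Lorentz factors of about 1.67 to 1.25 can be checked directly.

```python
import math


def lorentz_factor(beta: float) -> float:
    """Lorentz factor for a velocity beta = v/c (0 <= beta < 1)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)


def beta_from_lorentz_factor(gamma: float) -> float:
    """Velocity in units of c for a given Lorentz factor (gamma >= 1)."""
    if gamma < 1.0:
        raise ValueError("gamma must be at least 1")
    return math.sqrt(1.0 - 1.0 / (gamma * gamma))


if __name__ == "__main__":
    for beta in (0.8, 0.6):
        print(f"beta = {beta}: gamma = {lorentz_factor(beta):.2f}")
    for gamma in (2.5, 3.5):
        print(f"gamma = {gamma}: beta = {beta_from_lorentz_factor(gamma):.3f}")
```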

A hidden jet, which does not contribute significantly to the observed 
afterglow, may still exist (scenario E in Figure 2), but its properties are 
tightly constrained. First, its edge must be far enough from the line-of- 
sight (≳10 degrees), which rules out off-axis gamma-ray emission as
the source of GRB 170817A. Second, for every reasonable set of 
parameters, an off-axis jet would have been brighter than the fast tail 
of the ejecta, implying that the observed emission must be dominated 
by a γ ~ 2–3 outflow (i.e. a cocoon) for the jet to remain undetected. In
addition, the jet energy should, most likely, be much lower than that of 
the cocoon, which needs fine-tuning of the jet properties (see 
Methods). We therefore conclude from the lack of a signature from an 
off-axis jet, that the jet was likely choked (scenario C in Figure 2). 

We compared the 3 GHz radio and X-ray** detections obtained on 
2017 September 02-03 (15-16 days post-merger). The measurements at 
these two disparate frequencies imply a spectral index of -0.6, consistent 
with our multi-epoch, multi-frequency, radio-only measurements (see 
Methods and Extended Data Figure 4). It is therefore likely that the 
radio and X-rays originate from the same (synchrotron) source, viz. 
a mildly relativistic outflow. This common origin can be confirmed if 
the X-ray flux continues to rise in a similar manner as the radio. We 
also highlight that, while at early times the cooling break will lie well 
above the soft X-ray frequencies, beyond ~10²-10³ days post-merger
this break may be seen moving downwards in frequency within the
electromagnetic spectrum. If the cooling break stays above 10¹⁸ Hz,
the common origin of the radio and X-rays implies that the Chandra
telescope will detect a brighter X-ray source (flux between 0.7×10⁻¹⁴ and
5.2×10⁻¹⁴ erg cm⁻² s⁻¹ in the 0.3-10 keV band; see Methods) during its
observation of GW170817 on December 03-06 (note: subsequent to the 
submission of this paper, the X-ray observations took place and confirmed 
our prediction). If a different spectral index is derived from these X-ray 
observations relative to the in-band radio spectral index presented here, 
or indeed at any time within ~1000 days of the merger, it will indicate 
that the cooling break has already shifted below the X-ray band, which 
would favor the fast tail of the merger ejecta as the common source of 
the X-ray and radio emission (see Methods). 

The confirmation of a wide-angle outflow in GW170817 bodes well 
for electromagnetic counterpart searches of future gravitational wave 
events. Although on-axis (and slightly off-axis; θ_obs ≲ 20 degrees) jets
produce bright panchromatic afterglows, they represent only a small 
fraction (~10%) of the gravitational wave events (factoring in the 
larger detectable distance for face-on events”*). In contrast, the emis- 
sion from wide-angle cocoons*"! will be potentially seen in a much 
larger fraction of events, and at virtually all wavelengths, thus increasing
the probability of the detection of electromagnetic counterparts.
The radio emission from the cocoon, evolving on timescales of weeks 
to months, especially provides a distinct signature (as opposed to the 
more common supernovae and AGN transients) and diagnostics for 
observers. Specifically in the case of GW170817, continued monitoring 
of the radio light curve will provide an independent constraint on the 
circum-merger density and thereby the properties of the blast wave that 
dominated the early-time radio emission. 

Our radio data support the hypothesis of a choked jet giving rise to a 
mildly relativistic cocoon (scenario C in Figure 2), but this is only one 
of the possible outcomes of neutron star merger events (see Figure 2). 
In some cases, the jet may break out after depositing a fraction of its 
energy into the cocoon, thereby still successfully producing a SGRB"! 
(scenario E in Figure 2). Indeed, a plateau in the distribution of SGRB 
durations has been highlighted as evidence that SGRB jets often prop- 
agate through slower traveling ejecta before breakout and at times it 
is choked“. The relative fractions of neutron star mergers that suc- 
cessfully produce a SGRB or a choked jet can be directly probed via 
radio follow-up of a sample of neutron star mergers in the upcoming 
LIGO-Virgo campaigns.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
Figure 1 | The radio light curve of GW170817. Panel (a): The flux
densities corresponding to the detections (markers with 1σ error bars;
some data points have errors smaller than the size of the marker) and
upper limits (markers with downward-pointing arrows) of GW170817 at
frequencies ranging from 0.6-15 GHz between day 16 and day 107
post-merger (ref. 12 and Extended Data Table 1). Panel (b): Same as the panel
(a) but with flux densities corrected for the spectral index α = -0.61
(see Methods) and early-time, non-constraining, upper limits removed.
The fit to the light curve with the temporal index δ = 0.78 (see Methods)
is shown as a red line and the uncertainty in δ (+/-0.05) as the red shaded
region. Panel (c): Residual plot after correcting for the spectral and
temporal variations. The observing frequencies are color coded according
to the colorbar displayed at the right (black for <1 GHz and yellow for
>10 GHz). The marker shapes denote measurements from different
telescopes.
Figure 2 | Schematic illustration of the various possible jet and
dynamical ejecta scenarios in GW170817. A) A jet seen on-axis, 
generating both the low-luminosity gamma-rays and the observed radio 
afterglow. This scenario cannot explain the late rise of the radio emission.
It is also unable to explain’! how a low-luminosity jet penetrates the ejecta.
It is therefore ruled out. B) A regular (luminous) SGRB jet seen off-axis,
producing the gamma-rays and the radio. The continuous moderate rise 
in the radio light curve rules out this scenario. C) A choked jet giving 
rise to a mildly relativistic (γ ~ 2-3) cocoon which generates the gamma-rays
and the radio waves via on-axis emission. This is the model that is
most consistent with the observational data. It accounts for the observed 
gamma-rays, X-rays (possibly also the ultraviolet and optical emission) 
and the radio emission, and provides a natural explanation for the
lack of an off-axis jet signature in the radio. D) The fast velocity tail
(β ~ 0.8-0.6, i.e. γ ~ 1.67-1.25) of the ejecta produces the radio emission.
In this case, the jet must be choked (otherwise its off-axis emission should 
have been seen). While the radio emission is fully consistent with this
scenario, the energy deposited in faster ejecta (γ ~ 2-3) must be very
low. In this scenario, the source of the observed gamma-rays remains 
unclear. E) A successful jet that drives a cocoon but does not have a clear 
signature in the radio. The cocoon generates the gamma-rays and the 
radio emission, and outshines the jet at all wavelengths. This scenario is 
less likely based on theoretical considerations, which suggest that the jet 
and the cocoon should have comparable energies, in which case the jet 
signature would have been observed in the radio band. This scenario can 
also be visualized as a “structured” jet, having a relativistic narrow core 
surrounded by a mildly relativistic wide-angle outflow, in which an off- 
axis observer does not see any signature of the core. The relativistic core 
could have produced a regular SGRB for an observer located along the axis 
of the jet. Such a jet, if it exists, could be too weak (made a sub-dominant 
contribution to the radio light curve early on) or too strong (such that its 
radio and X-ray signatures will be observed in the future; see Methods). 
[Figure 3 plot: 3 GHz radio data overplotted with three off-axis jet model light curves. Legend: θ_j = 10°, n = 2.5 × 10⁻⁴ cm⁻³; θ_j = 20°, E_iso = 1.5 × 10⁵¹ erg; θ_j = 15°, n = 10⁻⁴ cm⁻³, E_iso = 3 × 10⁵⁰ erg (from Margutti et al. 2017).]
Figure 3 | Off-axis jet models. Synthetic light curves with a range of jet
opening angles θ_j, isotropic-equivalent energy E_iso, and the ISM density n
(see Methods) overplotted on the 3 GHz light curve (error bars are 1σ;
ref. 12 and Extended Data Table 1). The overall shape of the light curve
remains unchanged even after changing these parameters. We have
considered a wide range of parameters in the phase space of off-axis
models (including unlikely scenarios like n = 10⁻⁶ cm⁻³; see Methods);
none of the models give a good fit to the observational data, and hence we
rule out the classical off-axis jet scenario as a viable explanation for the
radio afterglow. The dashed black and dotted red curves are calculated
using the codes described in the Methods. The dashed-dot blue curve is
taken from figure 3 of ref. 4 (scaled to 3 GHz using α = -0.6). All off-axis
models assume θ_obs = 26 deg, ε_e = 0.1, ε_B = 0.01 and p = 2.2 (see main text
and Methods).
[Figure 4 plot: 3 GHz radio data overplotted with three quasi-spherical ejecta model light curves. Legend: β_max = 0.8, E(>βγ) = 5 × 10⁵⁰ (βγ/0.4)⁻⁵, n = 0.03 cm⁻³, ε_B = 0.003; γ_max = 3.5, E(>βγ) = 2 × 10⁵¹ (βγ)⁻⁵, n = 8 × 10⁻⁵ cm⁻³, ε_B = 0.01; cocoon model from Gottlieb et al. 2017.]
Figure 4 | Quasi-spherical ejecta models. Radio light curves arising from
quasi-spherical ejecta with velocity gradients, overplotted on the observed
3 GHz data spanning days 16-93 post-merger (filled yellow circles; error
bars are 1σ; ref. 12 and Extended Data Table 1). The solid red and dashed
blue light curves represent power law models with maximum Lorentz
factors γ = 3.5 and γ = 1.67 respectively (i.e. maximum β = v/c = 0.96 and
0.8 respectively). These curves approximately correspond to the cocoon
and dynamical ejecta, respectively. The shallow rise of the radio light
curve is consistent with a profile E(>βγ) ∝ (βγ)⁻⁵. For n ~ 0.03 cm⁻³, the
radio emission at 93 days is produced by an ejecta component
with a velocity of ~0.6c and kinetic energy of ~10⁴⁹ erg. For a lower
ISM density, ~10⁻⁴ cm⁻³, the radio emission at 93 days is produced by a
component with a velocity of 0.9c and energy 10⁴⁹ erg. Parameters ε_e = 0.1
and p = 2.2 are used for both models. Also shown for reference is the
cocoon model light curve (dotted black curve) taken from ref. 14, where
parameter values n = 1.3×10⁻⁴ cm⁻³, ε_B = 0.01, ε_e = 0.1 and p = 2.1 are used.
METHODS

1. Radio Data Analysis. VLA. Radio observations of the GW170817 field were 
carried out with the Karl G. Jansky Very Large Array in its B configuration, under 
a Director Discretionary Time (DDT) program (VLA/17B-397; PI: K. Mooley). 
All observations were carried out with the Wideband Interferometric Digital 
Architecture (WIDAR) correlator in multiple bands including L-band (nominal 
center frequency of 1.5 GHz, with a bandwidth of 1 GHz), S-band (nominal center 
frequency of 3 GHz, with a bandwidth of 2 GHz), and C-band (nominal center 
frequency of 6 GHz, with a bandwidth of 4 GHz). We used QSO J1248-1959 
(L-band and S-band) and QSO J1258-2219 (C-band) as our phase calibrator 
sources, and 3C 286 or 3C 147 as flux density and bandpass calibrators. The data 
were calibrated and flagged for radio frequency interference (RFI) using the VLA 
automated calibration pipeline which runs in the Common Astronomy Software 
Applications package (CASA”°). We manually removed further RFI, wherever 
necessary, after calibration. Images of the observed field were formed using the 
CLEAN algorithm (with the “psfmode” parameter set to Hogbom”®), which 
we ran in the interactive mode. The results of our VLA follow-up campaign of 
GW170817 are reported in Extended Data Table 1, and the image cutouts are 
shown in Extended Data Figure 1. The flux densities were measured at the Gaia/ 
HST position’. Flux density measurement uncertainties denote the local root- 
mean-square (rms) noise. An additional 5% fractional error on the measured flux 
density is expected due to inaccuracies in the flux density calibration. For non- 
detections, upper-limits are calculated as three times the local rms noise in the 
image. 

ATCA. We observed GW170817 on 2017 November 01, November 18 and 
December 02 using the Australia Telescope Compact Array (ATCA) under a 
target of opportunity program (CX391; PI: T. Murphy). During these observations 
the array was in configurations 6A, 1.5C and 6C respectively. We observed using 
two 2 GHz frequency bands with central frequencies of 5.5 and 9.0 GHz. For all
epochs, the flux scale and bandpass response were determined using the ATCA 
primary calibrator PKS B1934-638, and observations of QSO B1245-197 were used 
to calibrate the complex gains. The visibility data were reduced using the standard 
routines in the MIRIAD environment”. The calibrated visibility data were split 
into the separate bands (5.5 GHz and 9.0 GHz), averaged to 32 MHz channels, and 
imported into DIFMAP”’. Bright field sources were modeled separately for each 
band using the visibility data and a combination of point-source and Gaussian 
components with power-law spectra. With the field sources modelled and sub- 
tracted from the visibility data, the dominant emission in the residual image was 
from GW170817. Restored images for each band were generated by convolving 
the model components with the restoring beam, adding the residual map and then
averaging to form a wide-band image. Image-based Gaussian fitting for an
unresolved source was performed in the region of GW170817, leaving the flux density
and source position unconstrained. The source position from the fitting agrees 
with the Gaia/HST position?” of GW170817. The measured radio flux densities 
in the combined images are reported in the Extended Data Table 1, and the image 
cutouts are shown in Extended Data Figure 1. 

GMRT. We carried out observations of the GW170817 field with the upgraded 
Giant Metrewave Radio Telescope (uGMRT) at 700 MHz under a DDT program
(DDTB288; PI: K. De). All observations were carried out with 400 MHz bandwidth 
centered at 750 MHz using the non-polar continuum interferometric mode of 
the GMRT Wideband Backend (GWB*). Pointings were centered at the location 
of the optical transient. 3C 286 was used as the absolute flux scale and bandpass
calibrator, while phase calibration was done with the sources J1248-199 (for the
2017 September 16 observation) and 3C 283 (for all other observations). These data 
were calibrated and RFI flagged using a custom-developed CASA pipeline. The 
data were then imaged interactively with the CASA task CLEAN, incorporating a 
few iterations of phase-only self-calibration by building a model for bright sources 
in the field with each iteration. The GMRT flux density measurements at the Gaia/ 
HST position?’ are reported in the Extended Data Table 1. The image cutouts are 
shown in Extended Data Figure 1. 

1.1 Radio Data Power-law Fit. We carried out a least-squares fit to the assembled
radio data as a function of time and frequency, using a two-dimensional power-law
model:


\[ S(\nu, t) = S_0 \left(\frac{\nu}{\nu_0}\right)^{\alpha} \left(\frac{t}{t_0}\right)^{\delta} \]


The fit results are shown in Extended Data Figure 2, where we find good results
for α = -0.61+/-0.05, δ = 0.78+/-0.05, S₀ = 13.1+/-0.4 μJy, ν₀ = 3 GHz and t₀ = 10
d. The fit has χ² = 42.3 for 44 degrees-of-freedom, although there are only 27
detections among the 47 data-points.
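
As a rough illustration of the model above, the following sketch evaluates the two-dimensional power law at the best-fit values quoted in this section. The function names are hypothetical, and the normalisation is assumed to be in microjansky, in line with the flux densities listed in Extended Data Table 1; this is not the fitting code used for the paper.

```python
# Best-fit values quoted in section 1.1 (units are an assumption: microjansky).
ALPHA = -0.61     # spectral index
DELTA = 0.78      # temporal index
S0_UJY = 13.1     # normalisation at nu0, t0
NU0_GHZ = 3.0     # reference frequency
T0_DAYS = 10.0    # reference time post-merger


def flux_density_ujy(nu_ghz, t_days):
    """Evaluate S(nu, t) = S0 (nu/nu0)^alpha (t/t0)^delta in microjansky."""
    return S0_UJY * (nu_ghz / NU0_GHZ) ** ALPHA * (t_days / T0_DAYS) ** DELTA


def reduced_chi_square(chi2, dof):
    """Chi-square per degree of freedom, a quick goodness-of-fit summary."""
    return chi2 / dof


if __name__ == "__main__":
    # Model flux density at 3 GHz, 93 days post-merger.
    print(flux_density_ujy(3.0, 93.0))
    # Goodness of fit for the quoted chi-square and degrees of freedom.
    print(reduced_chi_square(42.3, 44))
```

A reduced chi-square close to unity, as obtained from the quoted values, is what one expects when a single power law in frequency and time describes the data adequately.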

1.2 Multi-epoch radio spectra. In Extended Data Figure 3 we show the radio con- 
tinuum spectra obtained at different epochs. All epochs are individually consistent 
with the spectral index α = -0.61 within 1σ.
2. Model Descriptions. 2.1 Off-axis afterglows. The radio light curves were
calculated using two independent semi-analytic codes*”, which are based on similar
approximations. Both codes were compared to, and have been found to be largely 
consistent with, the light curves produced by the BOXFIT code*’. In short, both 
codes approximate the jetted blast wave at any time in the source-frame as a single 
zone emitting region which is a part of a sphere with an opening angle, θ_j. The
hydrodynamics includes the shock location and velocity, and the jet spreading. The 
hydrodynamic variables in the emitting region are set to their values immediately 
behind the shock. The emission from each location along the shock is calculated 
using standard afterglow theory*4, where the microphysics is parameterized by the 
fraction of internal energy that goes to the electrons, ε_e, the fraction of internal
energy that goes to the magnetic field, ε_B, and the power-law index of the electron
distribution. The code calculates the rest frame emissivity at any time and any 
location along the shock and the specific flux observed at a given viewing angle 
at a given time and frequency is then found by integrating the contribution over 
equal-arrival-time surfaces, with a proper boost to the observer frame. 

2.2 Quasi-spherical ejecta. Radio light curves arising from quasi-spherical 
outflows, e.g., a cocoon and the tail of the dynamical ejecta, are approximately 
described by a model with a single one-dimensional velocity profile: E(>βγ)
∝ (βγ)⁻ᵏ, where β is a velocity in units of the speed of light and γ is a Lorentz factor.
The slope of the observed radio light curve is consistently explained with k = 5.
The light curves are calculated using the same codes as in section 2.1. In Figure 4, 
we show two cases: (1) a cocoon model, E(>βγ) = 2 × 10⁵¹ (βγ)⁻⁵ erg with a
maximum Lorentz factor of 3.5, n = 8×10⁻⁵ cm⁻³, and ε_B = 0.01, and (2) a dynamical
ejecta model, E(>βγ) = 5 × 10⁵⁰ (βγ/0.4)⁻⁵ erg with a maximum velocity of 0.8c,
n = 0.03 cm⁻³, and ε_B = 0.003. This velocity profile of the dynamical ejecta contains
a larger mass traveling faster than 0.6c by a factor of ~5 compared with that found 
in general relativistic numerical simulations””?!. The small amount of mass ejected 
at these high velocities is plausible since the simulations are affected by finite 
resolution and artificial atmosphere. In addition, Figure 4 shows a prediction from 
the full 2D simulation of a choked jet and the resulting cocoon presented in ref. 9. 
The light curve is taken from figure 4 of ref. 14 without any attempt to fit the radio 
data that was added since it was published. A more detailed publication reporting 
the full set of 2D simulations is in preparation. Finally, an upper limit on the ISM 
density’? of 0.04cm* suggests that the ejecta contains a fast moving component 
with v= 0.6c. For all the models shown in Figure 4, the mass of the ejecta that 
produces the radio signal up to 93 days is only ~10 Mo. This velocity is faster, 
and the mass is much lower, than those inferred from the kilonova emission*. We 
note that kilonova ejecta will produce observable radio signals on a timescale of 
years. 
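
The velocity profile used in this section can be sketched numerically. The helper names below are hypothetical, and the normalisations are the ones read from the two cases described above (a power law of index k = 5, with 5 × 10⁵⁰ erg at βγ = 0.4 for the dynamical ejecta and 2 × 10⁵¹ erg at βγ = 1 for the cocoon); treat it as an illustration of the scaling rather than the light-curve code itself.

```python
import math


def lorentz_factor(beta):
    """Lorentz factor for a velocity beta = v/c (0 <= beta < 1)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def beta_gamma(beta):
    """Dimensionless four-velocity beta * gamma."""
    return beta * lorentz_factor(beta)


def energy_above(beta, e_norm_erg, bg_norm=1.0, k=5.0):
    """Energy in material faster than beta: E(>bg) = e_norm * (bg / bg_norm)^(-k)."""
    return e_norm_erg * (beta_gamma(beta) / bg_norm) ** (-k)


if __name__ == "__main__":
    # Lorentz factors for the velocity range discussed for the fast ejecta tail.
    print(lorentz_factor(0.8), lorentz_factor(0.6))
    # Dynamical ejecta case: energy carried by material faster than 0.6c.
    print(energy_above(0.6, 5e50, bg_norm=0.4))
    # Cocoon case: energy carried by material faster than 0.9c.
    print(energy_above(0.9, 2e51))
```

Because the profile falls as the fifth power of βγ, only a small fraction of the total energy resides in the fastest material, which is why the radio light curve keeps rising as slower, more energetic layers catch up with the blast wave.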

3. Hiding an off-axis jet. Hiding a luminous off-axis jet (of the type seen in regular
SGRBs), given the radio data, is not trivial. First, the jet emission peaks once its
Lorentz factor drops to ~1/(θ_obs - θ_j), where θ_obs is the viewing angle with respect to
the jet axis and θ_j is the jet opening angle. Thus, emission from a jet that points only
slightly away from us (<10 degrees) will peak when its Lorentz factor is high (≳ 6).
Since the flux in the radio at a given time is extremely sensitive to the blast wave
Lorentz factor (roughly as ~γ¹⁰), a jet at that angle will be much brighter than any
on-axis mildly relativistic outflow around the peak, even if the outflow carried
much more energy than the jet. Therefore, a hidden jet must be far away from the
line of sight, namely θ_obs - θ_j ≳ 10 degrees. At such an angle, any gamma-ray signal
produced by a relativistic jet will be too faint to explain the observed gamma-ray
signal¹¹. Thus, while our previous radio observations strongly disfavored a regular
SGRB seen off-axis as the origin of the gamma-rays¹² (scenario B in Figure 2), the
additional observations presented here practically rule this out.
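
The two scalings used in this argument can be checked with a few lines of arithmetic. The sketch below assumes the approximate relations stated above (peak when the Lorentz factor is about the inverse of the angular offset in radians, and radio flux scaling roughly as the tenth power of the Lorentz factor, all other parameters held fixed); the function names are hypothetical.

```python
import math


def peak_lorentz_factor(theta_obs_deg, theta_j_deg):
    """Lorentz factor at which off-axis jet emission peaks: ~1/(theta_obs - theta_j)."""
    offset_rad = math.radians(theta_obs_deg - theta_j_deg)
    if offset_rad <= 0.0:
        raise ValueError("observer must lie outside the jet opening angle")
    return 1.0 / offset_rad


def relative_radio_flux(gamma_a, gamma_b, index=10.0):
    """Flux ratio of two blast waves differing only in Lorentz factor (flux ~ gamma^index)."""
    return (gamma_a / gamma_b) ** index


if __name__ == "__main__":
    # A jet whose edge is 10 degrees from the line of sight peaks at gamma of about 6.
    gamma_peak = peak_lorentz_factor(20.0, 10.0)
    print(gamma_peak)
    # Compare with a mildly relativistic outflow of gamma = 3 at otherwise equal parameters.
    print(relative_radio_flux(gamma_peak, 3.0))
```

With an offset of 10 degrees the peak Lorentz factor is close to 6, and the steep dependence on the Lorentz factor then gives a flux ratio of several hundred relative to a γ = 3 outflow, which illustrates why a jet so close to the line of sight is hard to hide.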

The extreme dependence of the radio flux density on the blast wave Lorentz
factor also implies that, for reasonable parameters also at θ_obs - θ_j ≳ 10 degrees,
off-axis jet emission will outshine a blast wave driven by material with β ~ 0.8
(γ ~ 1.67). Thus, the radio emission from an off-axis jet may remain undetected
only if the observed emission is dominated by an on-axis material with γ ~ 3, which
is most likely a cocoon. In that case, a jet that is far from the line of sight may be
hidden in two ways, either by being significantly less energetic than the on-axis
outflow or, surprisingly, by being significantly more energetic (scenario E in Figure
2).
In the latter case the jet emission will not appear in the radio data available so far 
if it is so energetic that its Lorentz factor at day 93 is still significantly larger than 
Qobs-4). For example, a 10 degree jet with an isotropic equivalent energy of 10° erg, 
that propagates in circum-merger density of 10“ cm® and observed at an angle of 
30 degrees, peaks after 200 days and its brightness is comparable to the observed 
data only around day 90 (eg =0.01, €.=0.1). While we cannot rule out this option, 
the extreme jet energies make it unlikely, but if this is the case then we will see the 
jet contribution in the future. 

The other possibility, that the jet is less energetic than the on-axis outflow (again 
scenario E in Figure 2), cannot be tested observationally. However, it is unlikely 
based on theoretical considerations. The energy of the cocoon is distributed
over a large range of velocities. Thus, the energy of the mildly relativistic ejecta 
(-y~3) is expected to be only a small fraction of the total cocoon energy’. Moreover, 
observationally we see that the energy carried by slower moving on-axis material is 
at least a factor of 10 larger than energy carried by high velocity on-axis material. 
Now, the ratio between the total energy in the cocoon and the energy in the jet 
depends on the ratio between the time spent by the jet in the ejecta before it breaks 
out and the time over which the jet launching continues after the breakout takes 
place. The engine that launches the jet is not affected by the propagation of the jet
through the ejecta and is causally disconnected from the jet head, if and when it
breaks out of the ejecta. Therefore, without fine tuning, there is no reason for the
engine to stop upon breakout. If the jet breaks out successfully, the launching
of the jet is expected to continue over a time that is comparable to or larger than
the time it takes for the jet to cross the ejecta. As a result, the energy in the jet is
expected to be comparable to or larger than that in the cocoon. Thus, it is highly
unlikely that the jet is less energetic than the fastest cocoon material, which as noted
above carries only a small fraction of the total cocoon energy. 

We therefore conclude that there are no probable scenarios in which a jet 
successfully breaks out, producing an SGRB seen by another (non-Earth) observer, 
and remains undetected by our radio observations. We find the case in which the
jet is choked as the one that provides the best explanation for the entire set of
observations available to date.

4. The origin of the gamma-rays. Since a hidden jet cannot produce the observed 
gamma-rays and the rising radio emission indicates a mildly relativistic wide-angle 
outflow moving towards us, we can expect that this outflow is also the origin of the 
gamma-rays. We do not see any plausible scenario in which the kilonova ejecta can 
produce the gamma-rays by itself. Compactness arguments imply that this ejecta is 
too slow!“ and there is no natural dissipation process that can convert the kinetic 
energy of the ejecta to gamma-rays. The cocoon, on the other hand, can produce 
the gamma-rays. It has sufficient energy and its Lorentz factor is sufficiently high 
to avoid compactness issues, so in the presence of a dissipation mechanism it can 
produce the observed gamma-rays”!°3®°7, For example, a breakout of the shock 
driven by the cocoon through the expanding ejecta can produce the observed 
signal, accounting for its luminosity, duration, peak energy and spectral evolution?. 
5. Lower limit to the circum-merger density. The mean cosmological baryon 
density is a function of the D/H ratio*’, primordial Helium density”, cosmographic 
parameters” and the fraction of diffuse baryons in the IGM (ficm) and is given*! as 


\[ n_b \approx \left(1.88 \times 10^{-7}\,\mathrm{cm^{-3}}\right) f_{\mathrm{IGM}}\, (1 + z)^3 \]


We adopt f_IGM = 0.7. At z ~ 0, a density of 10⁻⁶ cm⁻³ corresponds to a baryon
overdensity Δ_b = 5. For the Lyman-alpha forest, Δ_b is in the range of 10-50, whereas
that in condensed halos⁴¹ is 10² < Δ_b < 10⁴. Thus, in the case of GW170817, a
lower limit to the ambient density is 2x10° cm™ and a typical value” would be 
~104%cm®. 
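
The conversion between baryon overdensity and number density follows directly from the expression for the mean baryon density given above. The sketch below uses only the coefficient, the adopted diffuse fraction of 0.7 and the overdensity ranges quoted in this section; the function names are hypothetical and the output is an order-of-magnitude guide only.

```python
N_B0_CM3 = 1.88e-7   # coefficient of the mean baryon density relation, in cm^-3
F_IGM = 0.7          # adopted fraction of diffuse baryons in the IGM


def mean_baryon_density(z, f_igm=F_IGM):
    """Mean diffuse baryon number density (cm^-3) at redshift z."""
    return N_B0_CM3 * f_igm * (1.0 + z) ** 3


def density_from_overdensity(delta_b, z=0.0, f_igm=F_IGM):
    """Number density (cm^-3) of a region with baryon overdensity delta_b."""
    return delta_b * mean_baryon_density(z, f_igm)


if __name__ == "__main__":
    # Mean density at z ~ 0.
    print(mean_baryon_density(0.0))
    # Overdensities quoted for the Lyman-alpha forest and for condensed halos.
    for delta_b in (5.0, 10.0, 50.0, 1e2, 1e4):
        print(delta_b, density_from_overdensity(delta_b))
```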

6. Radio-X-ray comparison. The 3 GHz flux density measured¹² on 2017
September 03.9 is 15+/-4 μJy. Scaling the X-ray fluxes given in ref. 6 (reported
in the energy range 0.3-8 keV) to the values reported in ref. 5 (0.3-10 keV), we
estimate the X-ray flux on 2017 September 02.2 as 5.5×10⁻¹⁵ erg cm⁻² s⁻¹, with a
1σ uncertainty of ~1.5×10⁻¹⁵ erg cm⁻² s⁻¹. We use this information (X-ray flux
density is 0.23+/-0.06 nJy at a nominal center frequency of 4×10¹⁷ Hz) to calculate
the spectral index between the radio and X-ray frequencies as -0.60+/-0.03. This is
consistent with our estimated value of the radio-only spectral index, -0.61+/-0.05,
within 1σ. Therefore the radio emission and X-rays likely originate from the
same source, and the cooling frequency ~16 days post-merger is well above the 
soft X-ray frequencies. Extended Data Figure 4 shows a panchromatic spectrum 
between the radio and X-ray frequencies. Ultraviolet and near-infrared data are 
also plotted for comparison. Although the early-time emission in the ultraviolet, 
optical and infrared frequencies was dominated by thermal emission, at late times 
there should be a significant synchrotron component. Using the temporal and 
spectral indices estimated for the radio-only data (earlier in the Methods section), 
and assuming the cooling break remains beyond 10¹⁸ Hz, we can predict X-ray
flux densities between 0.3-2.2 nJy (flux between 7×10⁻¹⁵ and 52×10⁻¹⁵ erg cm⁻² s⁻¹ in
the 0.3-10 keV band) on 2017 November 18 (and also for the Chandra observation
on December 03-06). We note that, subsequent to the submission of this paper,
the X-ray observations took place and confirmed our prediction. We estimate the 
synchrotron cooling frequency as: 


For γ ≫ 1 (as expected for cocoon):


a= 3 sa t —2. 
Uk 7X 109 Hz [3] is eh 
2 10-4 cm-3 0.01 100 days 


For β ≪ 1 (i.e. γ ~ 1; as expected for the dynamical ejecta tail):


aa - 
Vex 2x 10!8 Hz | z= le s |" 
0.6 0.03 cm=3 0.003 100 days 


We see that the cooling frequency at day ~16 post-merger is much larger than
10¹⁸ Hz, while beyond ~10²-10³ days post-merger this break should be seen
moving towards lower frequencies within the electromagnetic spectrum. 
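
The radio-to-X-ray spectral index in this section can be reproduced from the two flux densities alone. The sketch below assumes the values as read from the text (15+/-4 μJy at 3 GHz and 0.23+/-0.06 nJy at 4×10¹⁷ Hz) and uses simple first-order error propagation, so its uncertainty need not match the quoted one exactly. The extrapolation to day 93 reuses the fit parameters of section 1.1 and assumes no cooling break below the X-ray band. All function names are hypothetical.

```python
import math


def spectral_index(s1, nu1, s2, nu2):
    """Two-point spectral index alpha, defined by S proportional to nu^alpha."""
    return math.log(s2 / s1) / math.log(nu2 / nu1)


def spectral_index_error(s1, err1, nu1, s2, err2, nu2):
    """First-order uncertainty on alpha from the fractional flux-density errors."""
    fractional = math.hypot(err1 / s1, err2 / s2)
    return fractional / abs(math.log(nu2 / nu1))


def extrapolate(s_ref, nu_ref, nu, alpha):
    """Scale a flux density from nu_ref to nu along a single power law."""
    return s_ref * (nu / nu_ref) ** alpha


# Flux densities in jansky and frequencies in hertz (values assumed from the text).
RADIO_JY, RADIO_ERR_JY, RADIO_HZ = 15e-6, 4e-6, 3e9
XRAY_JY, XRAY_ERR_JY, XRAY_HZ = 0.23e-9, 0.06e-9, 4e17

alpha_rx = spectral_index(RADIO_JY, RADIO_HZ, XRAY_JY, XRAY_HZ)
alpha_rx_err = spectral_index_error(
    RADIO_JY, RADIO_ERR_JY, RADIO_HZ, XRAY_JY, XRAY_ERR_JY, XRAY_HZ
)

# Radio model at 3 GHz on day 93 (fit of section 1.1), scaled to the X-ray band.
radio_day93_jy = 13.1e-6 * (93.0 / 10.0) ** 0.78
xray_day93_njy = extrapolate(radio_day93_jy, RADIO_HZ, XRAY_HZ, -0.61) * 1e9

if __name__ == "__main__":
    print(alpha_rx, alpha_rx_err)
    print(xray_day93_njy)
```

The extrapolated day-93 value falls inside the 0.3-2.2 nJy range predicted in this section; the width of that range reflects the uncertainties on the spectral and temporal indices, which this single-value sketch does not propagate.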

Data availability. All relevant data are available from the corresponding author 
on request. Data presented in Figure 1 are included in Extended Data Table 1. 
Code availability. The codes used for generating the synthetic radio light curves 
are currently being readied for public release (publication in preparation). Radio 
data processing software: CASA, MIRIAD, DIFMAP. 
Extended Data Figure 1 | GW170817 radio image cutouts. Image cutouts
(30″ × 30″) from the upgraded GMRT, the VLA and the ATCA centred on
NGC 4993. The position of GW170817 is marked by two black lines.
Panels (a), (b) and (c) show images from August-September 2017, using
the data reported in ref. 12. Panels (d), (e) and (f) show recent data, from
October 2017. The flux density scale is denoted by the colorbar in each
column. The synthesized beam is shown as an ellipse in the lower right
corner of each image.
temporal indices. Joint confidence contours for α (the spectral power-law
index) and δ (the temporal power-law index). The contours are 1-, 2-, and
α = -0.61+/-0.05, δ = 0.78+/-0.05, is indicated by the red “x” marker.
Extended Data Figure 3 | Radio-only spectral indices of GW170817.
Radio spectral indices between 0.6-15 GHz spanning multiple epochs.
The different epochs are color coded. The corresponding days post-merger
and spectral indices are given in the legend. Error bars are 1σ. The joint
analysis of all radio data (see text in Methods section) implies
α = -0.61+/-0.05.
Extended Data Figure 4 | Comparison between the radio and X-ray flux
densities of GW170817. The comparison of the X-ray data with the radio
upper limits (arrows) and detections (filled circles) at different epochs.
Error bars are 1σ. The epochs: 2017 August 19, August 26-28, September
02-03 and November 18 (2 days, ~10 days, ~15 days and 93 days
post-merger respectively) are color coded (the epoch is given in the legend to
the upper-right corner) and marked with different symbols. The spectral
index (α) and corresponding electron power law index (p; assuming
cooling frequency is beyond 10¹⁸ Hz, as expected for a mildly relativistic
outflow) between 3 GHz and 10¹⁸ Hz as derived from the September 02-03
data (α = -0.60+/-0.03 and p = 2.20+/-0.06) are consistent with the
radio-only spectral indices, and shown here as a dashed grey line. This indicates
that the radio and X-rays originate from the same synchrotron source. The
corresponding predicted soft-X-ray flux density on November 18 (between
0.3-2.2 nJy; note: the Chandra X-ray observations from 03-06 December,
reported after the submission of this paper, confirmed the prediction) is
shown as a magenta unfilled circle with an error bar. The flux densities in
the ultraviolet (~10¹⁵ Hz) and near-infrared (~10¹⁴ Hz), dominated by
thermal emission at early times, are shown for reference.
Extended Data Table 1 | Radio data for GW170817


UT Date     ΔT (d)    Telescope   ν (GHz)   Bandwidth (GHz)   S_ν (μJy)
Sep 16.25    29.73    GMRT        0.68      0.2               < 246
Sep 17.84    31.32    VLA         3         2                 34 +/- 3.6
Sep 21.86    35.34    VLA         1.5       1                 44 +/- 10
Sep 25.86    39.34    VLA         15        4                 < 14.4
Oct 02.79    46.26    VLA         3         2                 44 +/- 4
Oct 09.79    53.26    VLA         6         4                 32 +/- 4
Oct 10.80    54.27    VLA         3         2                 48 +/- 6
Oct 13.75    57.22    VLA         3         2                 61 +/- 9
Oct 21.67    65.14    GMRT        0.61      0.4               117 +/- 42
Oct 23.69    67.16    VLA         6         4                 42.6 +/- 4.1
Oct 28.73    72.20    VLA         4.5       0.5               54.6 +/- 5.5
Nov 01.02    75.49    ATCA        7.25      4                 44.9 +/- 5.4
Nov 17.93    92.4     ATCA        7.25      4                 39.6 +/- 7
Nov 18.60    93.07    VLA         1.6       1                 98 +/- 14
Nov 18.66    93.13    VLA         3         2                 70 +/- 5.7
Nov 18.72    93.19    VLA         15        4                 18.6 +/- 3.1
Dec 02.89   107.36    ATCA        7.25      4                 66.5 +/- 5.6


Table notes: ΔT represents the time post-merger. The Nov 17.93 ATCA observation was affected by bad weather, and the uncertainty in the flux density is expected to be much larger than the one
reported here.
Magnetic cage and rope as the key for solar eruptions


Tahar Amari, Aurélien Canou, Jean-Jacques Aly, Francois Delyon & Fréderic Alauzet


Solar flares are spectacular coronal events that release large amounts 
of energy. They are classified as either eruptive or confined’, 
depending on whether they are associated with a coronal mass 
ejection. Two types of model have been developed to identify the 
mechanism that triggers confined flares, although it has hitherto not 
been possible to decide between them because the magnetic field at 
the origin of the flares could not be determined with the required 
accuracy*®, In the first type of model, the triggering is related to 
the topological complexity of the flaring structure, which implies 
the presence of magnetically singular surfaces”"'. This picture 
is observationally supported by the fact that radiative emission 
occurs near these features in many flaring regions!*-!”. The second 
type of model attributes a key role to the formation of a twisted 
flux rope, which becomes unstable. Its plausibility is supported by 
simulations'*’, by interpretations of some observations” and by 
laboratory experiments”°. Here we report modelling of a confined 
event that uses the measured photospheric magnetic field as input. 
We first use a static model to compute the slowly evolving magnetic 
state of the corona before the eruption, and then use a dynamical 
model to determine the evolution during the eruption itself. We 
find that a magnetic flux rope must be present throughout the entire 
event to match the field measurements. This rope evolves slowly 
before saturating and suddenly erupting. Its energy is insufficient 
to break through the overlying field, whose lines form a confining 
cage, but its twist is large enough to trigger a kink instability, leading 
to the confined flare, as previously suggested. Topology is not 
the main cause of the flare, but it traces out the locations of the 
X-ray emission. We show that a weaker magnetic cage would have 
produced a more energetic eruption with a coronal mass ejection, 
associated with a predicted energy upper bound for a given region. 

It is important to study the pre-eruptive magnetic structure of the 
active region before a solar eruption (which must be temporally and 
spatially isolated) with sufficient precision to capture detailed features 
such as the presence of a twisted flux rope (TFR). It is equally impor- 
tant to have an efficient magnetohydrodynamic model that is able to 
describe the eruptive phase and to identify the underlying triggering 
mechanism—for instance, the ideal magnetohydrodynamic instability 
of a TFR. The problem is notoriously difficult and previous attempts 
have not provided a clear answer to the basic questions. We have 
been able, however, to develop such efficient tools, and we apply them 
here to a specific flare-producing active region of the Sun. 

Our target is the National Oceanic and Atmospheric Administration (NOAA) 
active region AR12192, which was visible at the surface of the Sun 
in October 2014. This region is interesting because it exhibited huge 
spots, making it the largest region observed during more than two solar 
cycles (it was as large as Jupiter) and also the most active one of the 
current solar cycle (the 24th since 1755). However, it produced neither 
coronal mass ejections nor solar energetic particles, which seemed 
puzzling. But it was the site of very intense X-class flares, which were 
observed by the NASA Solar Dynamics Observatory (SDO) satellite 
and disturbed communication systems. Here we focus on the intense 
X3.1 flare that occurred at 21:10 UT (universal time) on 24 October 
2014, a day the region was particularly active. One of the rationales 
for this choice is that the filtergram of the Helioseismic and Magnetic 
Imager (HMI) onboard SDO provided a series of high-precision 
measurements of the photospheric field around the time of the eruption. 
This prompted other groups also to model this flare, but the 
configurations they obtained fundamentally differ from the ones reported below. 
For instance, a reconstruction of the pre-eruptive magnetic structure 
shows only the presence of a weakly twisted tube overlying a strongly 
sheared arcade, while a fully data-driven picture (including both the 
pre-eruptive and eruptive phases) does not show the formation of any 
TFR. Therefore it is not surprising that both papers propose an 
interpretation of the X3.1 flare, based on a tether-cutting reconnection 
process, that is at odds with ours. 

X-ray flux data from the Geostationary Operational Environmental 
Satellite (GOES) indicate that between 10:00 UT and 21:00 UT the only 
eruptive activity is a small C5-class flare occurring around 14:00 UT. 
This point was confirmed by checking with all wavelength data from 
the Atmospheric Imaging Assembly (AIA) instrument. To be sure to 
use data not affected by previous activity, however weak, we decided 
to restrict our attention to the quiet period running from 16:00 UT to 
21:00 UT on 24 October. In a first step, during that time interval we 
made a series of static reconstructions of the environment of the active 
region at the global Sun scale. Our main tool is the code MESHMHD, 
which we fed with composite SDO/HMI data (Fig. 1) prepared as 
explained in Methods. MESHMHD is a very accurate numerical code 
that can be used to treat both static and dynamic problems and has 
the characteristic of being able to achieve high resolution only where 
it is needed, that is, inside the active region at small scales. Our static 
reconstruction neglects the effects of the plasma and gravity forces, 
a common approximation that is well justified in the low corona. 
The computed equilibrium state is to a high degree force-free and 
divergence-free, which is crucial for solving the present problem and 
allows an accurate computation of important quantities, such as the 
magnetic helicity (see Methods). 

Our computation of a sequence of equilibria using this method 
shows (Fig. 2) that the active region configuration evolves slowly, 
revealing the progressive appearance of a TFR. The TFR is fully 
formed at 21:00 UT, when it reaches its maximum size (Figs 2 and 3). 
It is strongly confined inside an overlying magnetic cage, whose 
magnetic lines are not parallel to its axis (Fig. 3c) and which plays an 
important part in the evolution of the configuration, in agreement 
with recent observations. The magnetic twist of the TFR increases 
continuously (Extended Data Fig. 1). For the configuration obtained 
at 21:00 UT, that is, ten minutes before the major confined X3.1 flare, 
the twist has a value of about 2.5 turns (Extended Data Fig. 1). The 
structure of the electric current (and hence of the force-free function α) 
can also be determined thanks to our unstructured adapted mesh 
model. The current profile (Fig. 3d) appears to be complex. It contains 
both direct and return currents, with some of them flowing in 
background twisted ropes of non-negligible twist (Extended Data 
Fig. 2). Like the twist, the total magnetic energy W increases during 
the evolution (Extended Data Fig. 1), reaching a value of 1.8 × 10³⁴ erg 
in the last computed configuration (a much smaller value was previously 
claimed). If we restrict our attention to the sub-region containing 
the TFR, the stored magnetic energy is a factor of 1.09 above the 
potential energy W_pot of the whole region, but a factor of 1.24 above 
the part of W_pot that is contained in the sub-region. 

Figure 1 | Building magnetic input for the model. SDO/HMI high-resolution 
photospheric magnetic maps are used to build up adapted 
resolution boundary conditions. B_r denotes the radial component of the 
magnetic field. The composite map contains three different maps: a, the 
synoptic map of the Carrington Rotation 2156; b, the line-of-sight full-disk 
magnetogram in the Carrington heliographic coordinates (CHC); and 
c, the vector magnetogram patches from the Spaceweather Helioseismic 
and Magnetic Imager Active Region Patch (SHARP) of the Solar 
Dynamics Observatory data series in CHC. d, An iterative adapted scheme 
(on current and magnetic field) on a triangular mesh of the photosphere 
results in high-resolution boundary conditions only where necessary. 
The coloured-outline boxes in a indicate the positions of the patches 
provided by SDO, where the vector magnetic field has been measured; the 
dotted-outline region enlarged in b corresponds to the full-disk SDO/HMI 
magnetogram. 

Figure 2 | Magnetic field evolution before the major flare. Selected field 
lines of the reconstructed magnetic configurations starting on 24 October 
2014 at 16:00 UT are shown at different stages during the rising towards the 
major X3.1 flare: a, at 16:00 UT = H − 5 (five hours before the flare); b, at 
H − 3; c, at H − 2; and d, at hour H*, ten minutes before the flare. 

Figure 3 | TFR in the magnetic cage before the major confined eruption. 
We show characteristic properties of the configuration at H* in Fig. 2d. 
a, A TFR covering an active region the size of Jupiter has acquired a large 
twist (2.5 turns). b, Magnetic cage covering and maintaining the TFR. 
c, Larger-scale view of the multilayer structure of the magnetic cage. 
Only two layers have been plotted. d, View of the electric current system 
associated with the TFR, revealed by the mesh adaptation scheme. The 
colour code is symmetric, with blue (red) for negative (positive) values and 
saturation fixed at (−0.36, 0.36) Mm⁻¹. 

Up to the time of the flare, the slowly evolving magnetic field with its 
embedded TFR stays well confined. This is in accordance with the fact 
that W stays below the energy W_SO of the associated semi-open field 
(Extended Data Fig. 1; a field is expected to suffer a fast expansion 
and to open when W/W_SO approaches unity). For completeness, we 
have also computed the standard decay index n of the potential field 
associated with the pre-eruptive configuration. n is found (Extended 
Data Fig. 3) to remain well below the critical value n_c ≈ 1.5 that 
is generally considered to characterize the onset of the torus instability 
(this instability is thought to trigger the loss of confinement of the TFR 
in ejective events). This allows us to anticipate the confined character 
of the flare. We point out, however, that the applicability of the general 
argument is somewhat uncertain in the situation we encounter here. 
The quoted theoretical value of n_c is computed by considering a 
configuration in which an idealized TFR stays in equilibrium in a 
background potential magnetic field, while the pre-eruptive 
configurations exhibit non-negligible electric currents flowing outside the 
TFR (Extended Data Fig. 2). Hence the background field appears to be 
force-free rather than current-free, which may be expected to lead to a 
modification of n_c. A similar point was made in a previous study of the 
decay index n just before the X3.1 flare. 

To illustrate the importance of the background currents in the 
determination of the actual magnetic field B, we find it useful to compare 
its decrease with that of an auxiliary force-free magnetic field B′ for 
which the electric currents are fully confined inside a TFR that differs 
as little as possible from the TFR of B (then B′ is current-free outside 
its TFR; see Methods). The difference between the behaviours of both 
force-free fields appears quite neatly (Extended Data Fig. 4), with B 
showing a much stronger value above the central part of the TFR. As 
for the free energies, we find that W − W_pot ≈ 19 × 10³² erg for B, while 
W′ − W_pot ≈ 13 × 10³² erg for B′. Then there is a huge reserve of about 
6 × 10³² erg in the background, which would be sufficient by itself to 
power an X-class flare. 
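
The decay-index test discussed above can be illustrated with a short numerical sketch. It assumes the usual definition n = −d ln B_h / d ln z, which the text does not spell out, and it uses hypothetical power-law height profiles rather than the reconstructed field. The function name `decay_index` and all sample values are illustrative only.

```python
import numpy as np

N_CRIT = 1.5  # critical decay index quoted in the text for the torus instability


def decay_index(height, b_horizontal):
    """Return n(z) = -d ln(B_h) / d ln(z) by finite differences.

    Assumed definition (not stated in the source): B_h is the horizontal
    component of the potential field averaged over a sample area.
    """
    height = np.asarray(height, dtype=float)
    b_horizontal = np.asarray(b_horizontal, dtype=float)
    return -np.gradient(np.log(b_horizontal), np.log(height))


# Hypothetical profiles on a 5-150 Mm height grid (not data from the paper).
z = np.linspace(5.0, 150.0, 300)
slow_decay = 800.0 * (z / z[0]) ** -1.0   # B_h ~ z^-1, so n = 1 everywhere
fast_decay = 800.0 * (z / z[0]) ** -3.0   # B_h ~ z^-3, so n = 3 everywhere

n_slow = decay_index(z, slow_decay)
n_fast = decay_index(z, fast_decay)

stays_confined = bool(np.all(n_slow < N_CRIT))   # below threshold at all heights
may_erupt = bool(np.any(n_fast > N_CRIT))        # exceeds threshold somewhere
```

In this toy setting a slowly decaying overlying field keeps n below n_c at every height, which is the situation reported for the pre-eruptive configuration. As the text cautions, the threshold itself may shift when the background field carries currents.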

The quality of our reconstructions has been checked by computing 
the evolution and the asymptotic convergence along the iterative scheme 
of two standard diagnostics measuring, respectively, the extent to which 
the field is force-free and divergence-free (Extended Data Fig. 5). 
We have also compared the pre-eruptive configuration computed at 
21:00 UT with various observational data (Extended Data Figs 6 and 7). 

The second step of our method consists of testing the stability of the 
previously obtained pre-eruptive state. Our tool here is our numerical 
model METEOSOL, which is a magnetohydrodynamics code in 
Cartesian geometry, and we perform the computation in a numerical 
box that is large enough to allow for possible ejective processes. Each 
of the different states computed before 21:00 UT is taken as an initial 
state for the model, and is made to evolve by applying, in turn, during 
a short time, various boundary conditions that mimic physical 
processes occurring at the Sun’s surface: flux cancellation, converging 
motions and turbulent diffusion. We found that whatever the driving 
process, only the configuration of 21:00 UT leaves its equilibrium state, 
owing to the nonlinear development of a kink instability associated 
with the high value of the twist (see Methods). The TFR is most clearly 
seen changing its original shape into the helical one that is 
characteristic of the kink instability (Fig. 4). At the same time, it expands 
slightly, making it press against the cage above, which contains some 
non-negligible magnetic free energy associated with the presence of 
electric currents and shearing of the lines (this is the reserve of about 
6 × 10³² erg mentioned above). This transformation of the 
configuration is favourable for triggering a reconnection process between the 
lines of the TFR and the lines forming the different layers of the cage. 
This leads to the conversion of some amount of magnetic energy that is 
tapped from both regions, and to the production of the observed flare. 

No fast expansion and opening of the field are obtained in our 
simulations. The flare we compute is a confined one, and our results do 
not exhibit any disagreement with the conclusion above based on the 
decay index and the energy of the semi-open field. We also note that the 
shearing of the cage, which is made of magnetic lines whose direction 
varies with altitude (Fig. 3c), may play an important part in the torus 
stability of the TFR, in the same way that the shearing of a field 
supporting plasma against gravity may stabilize Rayleigh-Taylor modes. When 
the kink instability develops and the TFR deforms into a helical shape, 
the shearing ensures that in the cage above the TFR there are always 
magnetic lines having a non-negligible component orthogonal to the 
TFR; this is the component that is important for the confinement. 

Figure 4 | Evolution and confined instability of the TFR. We show 
selected field lines of some configurations obtained during the evolution 
driven by small photospheric perturbations of the configuration of 
24 October 2014 at 21:00 UT, taken as the initial condition. a, Initially 

Our results show that the role of the cage, and thus of the environment, 
crucially affects the class of eruption—either confined or eruptive— 
that can be produced in an active region. Indeed, applying flux can- 
cellation for a longer time and on a larger scale that includes the cage 
results in a decrease of the confinement that leads to a major disrup- 
tion of the structure and to the ejection of the TFR, that is, a coronal 
mass ejection is produced as in our previous work (Extended Data 
Fig. 9). Moreover, the maximum energy that could be stored in the 
configuration, which is the energy of the semi-open field, is found to 
be approached from below in the eruptive event (as previously found), 
whereas the energy is well below it in the confined one (Extended Data 
Fig. 1). Thus the free energy of the semi-open field appears to give an 
estimate of the energy that could be released from a region by the most 
energetic possible eruption. 

The result previously obtained in idealized numerical simulations 
is thus now shown to occur on the real Sun. We conclude 
that the cause of the confined eruption is the instability of the TFR, and 
not the topological structure of the configuration. It should be noted 
that our model shows an interesting correlation between the location of 
the predicted magnetic structure and the real Sun observations during 
the eruption (Extended Data Fig. 8). 

Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Input data. The spherical magnetic data (Fig. 1) we use are composite, obtained 
from several patches of the vector magnetic field using the following technical 
procedure. To create the dataset, we use three different SDO/HMI products: the 
synoptic map of Carrington Rotation 2156 (a method of numbering the solar 
rotations as seen from Earth since 1853), the line-of-sight full-disk magnetogram in 
charge-coupled device (CCD) coordinates, and the vector magnetograms from 
the SHARP data series in CCD coordinates. To avoid the imprecision of the 
data obtained near the solar limb, we use only the vector magnetograms of the 
regions nearest to the solar active region AR12192. These vector magnetograms 
are transformed to spherical coordinates³¹ and then embedded in the line-of-sight 
full-disk magnetogram at their respective locations in the CCD coordinates. The 
resulting data are then remapped from the CCD coordinates to the heliographic 
coordinates³², keeping only a spherical wedge 125° in longitude and 70° in latitude. 
Independently, the synoptic map is interpolated to a uniform latitude-longitude 
mesh. The polar field is fitted with a geometrical specification³³ to take care of the 
inaccuracy of the magnetic field measurements made near the solar poles. The final 
step is to embed this composite vector full-disk magnetogram into the synoptic 
map. To avoid discontinuities at the interfaces between the synoptic map and the 
full-disk map as well as between the fitted polar field and the unmodified synoptic 
field, the resulting radial component of the field, B_r, is subjected to a diffusion 
process. The non-uniform diffusion coefficient is taken to be larger around these 
interfaces than in the rest of the composite map, and it is set to zero in the full-disk 
area to keep B_r unchanged therein. As is required in any case, a mesh adaptation 
algorithm is used to create a high-resolution mesh in the region where it is needed. 
This is shown in Fig. 1d for the reconstruction of 21:00 UT. We thus finally obtained 
a large composite map containing a lot of information that needs to be handled 
correctly by the model to capture the physics involved. 
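
The interface smoothing described above can be sketched in one dimension. This is only an illustration under stated assumptions: a unit-spaced 1D grid, an explicit scheme for dB/dt = D(x) d²B/dx² (a non-conservative form chosen for simplicity), and hypothetical names and values throughout. The source does not give the actual scheme, which operates on the spherical composite map.

```python
import numpy as np


def smooth_outside_protected(b_r, diffusivity, n_steps=200, dt=0.2):
    """Explicit smoothing dB/dt = D(x) d2B/dx2 on a unit-spaced 1D grid.

    Cells with D = 0 are never modified; the two end cells are held fixed.
    """
    b = np.array(b_r, dtype=float)
    d = np.asarray(diffusivity, dtype=float)
    if dt * d.max() > 0.5:
        raise ValueError("time step too large for the explicit scheme")
    for _ in range(n_steps):
        laplacian = np.zeros_like(b)
        laplacian[1:-1] = b[2:] - 2.0 * b[1:-1] + b[:-2]
        b += dt * d * laplacian
    return b


# Hypothetical 1D cut: cells 0-49 stand for the full-disk area (protected),
# cells 50-99 for the synoptic map, with a jump in B_r at the interface.
index = np.arange(100)
protected = index < 50
b_initial = np.where(protected, 1.0, 0.0)
# D is zero in the protected area and largest next to the interface.
diffusivity = np.where(
    protected, 0.0, 0.1 + 0.9 * np.exp(-(((index - 50) / 5.0) ** 2))
)

b_smoothed = smooth_outside_protected(b_initial, diffusivity)
jump_before = abs(b_initial[50] - b_initial[49])
jump_after = abs(b_smoothed[50] - b_smoothed[49])
```

Setting D to zero in the protected cells leaves the measured values untouched while the neighbouring cells relax towards them, which is the behaviour the text asks of the diffusion step.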

The pre-eruptive multi-scale environment model. Our static calculations assume 
that the coronal magnetic field B is force-free, that is, we ignore the effects of the 
plasma and gravity forces. This is a common approximation (also used, for exam- 
ple, in ref. 8) that is well justified by the low value of the plasma beta (37 x 10°“) 
in the lower solar corona³⁴, where the eruption is occurring. This approximation 
may be considered valid not only inside a single active region, but also in a larger 
part of the low corona, including several interconnected active regions. 

In our (static) model, the photosphere is represented by a sphere S of radius 1 
and the corona by the domain Ω, which is contained between S and an outer sphere 
Sp of radius r= 2.57. Ω is assumed to be filled up with low-β, slightly resistive and 
viscous plasma embedded in the magnetic field B. B obeys the standard force-free 
equations 


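$$\nabla \times \mathbf{B} = \alpha \mathbf{B} \qquad (1)$$
$$\nabla \cdot \mathbf{B} = 0 \qquad (2)$$
$$\mathbf{B} \cdot \nabla \alpha = 0 \qquad (3)$$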


where α = α(r) is some scalar function that is constant along field lines according 
to equation (3). Beyond the outer sphere, the physics should be dominated by the solar wind, 
which drives the magnetic field almost radially in the heliosphere. The outer sphere is then called 
the source surface. 
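
As a quick sanity check on equations (1) to (3), the sketch below verifies them numerically for a textbook linear force-free field, B = (sin αz, cos αz, 0) with constant α. This field is an illustrative choice and is unrelated to the reconstructed coronal field; the grid and the value of α are arbitrary.

```python
import numpy as np

alpha = 0.7                           # constant force-free parameter (arbitrary value)
z = np.linspace(0.0, 10.0, 2001)      # height grid, arbitrary units
bx = np.sin(alpha * z)
by = np.cos(alpha * z)
# Bz = 0 and nothing depends on x or y, so div B = 0 holds identically
# (equation (2)); alpha is uniform, so B . grad(alpha) = 0 (equation (3)).

# With these symmetries, curl B = (-dBy/dz, dBx/dz, 0).
curl_x = -np.gradient(by, z, edge_order=2)
curl_y = np.gradient(bx, z, edge_order=2)

# Largest departure from equation (1), curl B = alpha * B.
residual = max(
    np.max(np.abs(curl_x - alpha * bx)),
    np.max(np.abs(curl_y - alpha * by)),
)
```

The residual is limited only by the finite-difference error, so it shrinks as the grid is refined.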

Equations (1) to (3) are solved in Ω using an algorithm exploiting their mixed 
elliptic-hyperbolic structure. This algorithm has been well described in both 
Cartesian and spherical³⁵ geometry. It requires as boundary conditions the normal 
component B_n of B on S, the vanishing on the outer sphere of the tangential component of B, 
which thus continuously matches the outer radial field, and an assignment of the 
value of α to each magnetic line connecting the parts S⁺ and S⁻ of S, where B_n > 0 
and B_n < 0, respectively. Here, we set α to be equal on any line to a weighted 
combination of the values α⁺ and α⁻ given by the observations at its two footpoints 
on S, with the weights (summing to 1) reflecting our relative confidence in the 
precision of these data. The actual computation of B is done by imposing such a 
boundary condition at each hyperbolic step of the iterative scheme³⁶,³⁷. 
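
The footpoint weighting described above amounts to a convex combination of the two observed values. The sketch below is a minimal illustration: the text says only that the weights sum to 1 and reflect relative confidence, so the inverse-variance rule in `confidence_weight` is an assumption, and all names and numbers are hypothetical.

```python
def confidence_weight(sigma_plus, sigma_minus):
    """Hypothetical inverse-variance weight for the positive-polarity footpoint."""
    inv_plus = 1.0 / sigma_plus ** 2
    inv_minus = 1.0 / sigma_minus ** 2
    return inv_plus / (inv_plus + inv_minus)


def line_alpha(alpha_plus, alpha_minus, weight_plus):
    """Weighted combination of the alpha values at the two footpoints.

    The weights are weight_plus and (1 - weight_plus), so they sum to 1.
    """
    if not 0.0 <= weight_plus <= 1.0:
        raise ValueError("weight_plus must lie in [0, 1]")
    return weight_plus * alpha_plus + (1.0 - weight_plus) * alpha_minus


# Example: the positive footpoint is measured twice as precisely.
w = confidence_weight(sigma_plus=0.01, sigma_minus=0.02)
alpha_line = line_alpha(alpha_plus=0.10, alpha_minus=0.20, weight_plus=w)
```

Because a single value is assigned to the whole line, the result stays consistent with equation (3), which requires α to be constant along each field line.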

To take advantage of the richness of these high-resolution data, our multi-scale 
method solves equations (1) to (3) numerically using a scheme defined on an 
unstructured tetrahedral mesh that can be adapted along the iterative algorithm 
lying at the heart of our MESHMHD numerical code. To begin with, we look for 
the best triangular mesh adapted to the data given on the photosphere. This is 
done by starting with a crude triangular mesh and using a fixed-point algorithm. 
Adaptation followed by data interpolation (for α and B_n) on the new mesh thus 
leads to an optimal mesh associated with the data, with high resolution where data 
information requires small triangles and lower resolution elsewhere. This complex 
procedure is needed to obtain the high-quality mesh shown in Fig. 1, which is then 
used to build an initial tetrahedral mesh inside Ω. Our iterative algorithm is next 
started by using this initial mesh. After each iteration, quantities called sensors, 
such as the magnetic field B and α, are used to adapt the mesh where small-scale 
structures form, such as electric current concentrations and the TFR (Fig. 2). This 
is continued using a fixed-point method³⁸ until the mesh/solution pair converges 
to a small interpolation error. 
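
The adapt-then-interpolate loop described above can be illustrated with a one-dimensional analogue. This is not the MESHMHD scheme, which works on unstructured triangular and tetrahedral meshes. The sketch swaps in a simple equidistribution rule with an arc-length monitor, and the `sensor` function is a hypothetical stand-in for data such as α or B_n.

```python
import numpy as np


def sensor(x):
    """Hypothetical data with one sharp feature, standing in for alpha or B_n."""
    return np.tanh(50.0 * (x - 0.5))


def adapt_once(nodes):
    """Move nodes so that each cell carries the same share of the monitor."""
    values = sensor(nodes)                    # data evaluated on the current mesh
    dx = np.diff(nodes)
    slope = np.diff(values) / dx
    monitor = np.sqrt(1.0 + slope ** 2)       # large where the data vary quickly
    cumulative = np.concatenate(([0.0], np.cumsum(monitor * dx)))
    targets = np.linspace(0.0, cumulative[-1], nodes.size)
    return np.interp(targets, cumulative, nodes)


def fixed_point_adaptation(n_nodes=41, max_iter=50, tol=1e-4):
    """Repeat adaptation until the nodes stop moving or max_iter is reached."""
    nodes = np.linspace(0.0, 1.0, n_nodes)    # crude uniform starting mesh
    for _ in range(max_iter):
        new_nodes = adapt_once(nodes)
        change = np.max(np.abs(new_nodes - nodes))
        nodes = new_nodes
        if change < tol:
            break
    return nodes


mesh = fixed_point_adaptation()
spacing = np.diff(mesh)
finest = int(spacing.argmin())
finest_location = 0.5 * (mesh[finest] + mesh[finest + 1])
```

The nodes cluster around the sharp feature and stay coarse elsewhere, mirroring the statement that resolution is high only where the data require it.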

Using this model, we obtain an equilibrium with very good standard 
diagnostics. The evolution and asymptotic convergence along the iterative scheme have 
been computed for two of these diagnostics (Extended Data Fig. 5): ⟨J, B⟩ and ⟨|f_i|⟩. 
⟨J, B⟩ is the angle between the electric current density J and B, which measures 
the extent to which B is force-free, and the standard parameter ⟨|f_i|⟩ provides an 
estimate of the divergence of B. When convergence is obtained, ⟨J, B⟩ is smaller 
than 2°, while ⟨|f_i|⟩ + 10-1”. This means that the constraint ∇ · B = 0 is satisfied to a 
high degree of precision (|∇ · B| is at most equal to about 107’? G Mm⁻¹). This last 
result is a direct consequence of the structure of our numerical algorithm, which 
rests on a discretization of the equations that makes the magnetic field exactly 
divergence-free. We work in the functional spaces for element representation that 
are divergence-free on each tetrahedron, which ensures that B is in the kernel of 
the divergence operator. B is computed as the curl of a vector potential A, with B 
and A defined on each tetrahedron, while the divergence operator is defined 
by duality. 
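
The two diagnostics can be sketched on a uniform Cartesian grid. The definitions used here are assumptions based on common practice, since the source does not state them: a current-weighted mean angle between J and B, and f_i = |∇ · B|_i Δx / (6 |B|_i). The test field is the linear force-free field B = (sin αz, cos αz, 0), not the reconstructed one, and the finite differences stand in for the paper's divergence-free elements.

```python
import numpy as np


def force_free_diagnostics(bx, by, bz, dx):
    """Return (current-weighted J-B angle in degrees, mean |f_i|).

    Arrays are indexed [x, y, z] on a uniform grid of spacing dx.
    J is taken as curl B; constant factors cancel in both diagnostics.
    """
    jx = np.gradient(bz, dx, axis=1) - np.gradient(by, dx, axis=2)
    jy = np.gradient(bx, dx, axis=2) - np.gradient(bz, dx, axis=0)
    jz = np.gradient(by, dx, axis=0) - np.gradient(bx, dx, axis=1)
    div_b = (np.gradient(bx, dx, axis=0) + np.gradient(by, dx, axis=1)
             + np.gradient(bz, dx, axis=2))

    b_mag = np.sqrt(bx ** 2 + by ** 2 + bz ** 2)
    j_mag = np.sqrt(jx ** 2 + jy ** 2 + jz ** 2)
    cross = np.sqrt((jy * bz - jz * by) ** 2 + (jz * bx - jx * bz) ** 2
                    + (jx * by - jy * bx) ** 2)

    ok = (b_mag > 0) & (j_mag > 0)
    sin_theta = cross[ok] / (j_mag[ok] * b_mag[ok])
    sigma_j = np.sum(j_mag[ok] * sin_theta) / np.sum(j_mag[ok])
    angle_deg = np.degrees(np.arcsin(np.clip(sigma_j, 0.0, 1.0)))

    nonzero = b_mag > 0
    f_i = np.abs(div_b[nonzero]) * dx / (6.0 * b_mag[nonzero])
    return angle_deg, f_i.mean()


# Test field: B = (sin(alpha z), cos(alpha z), 0), force-free with constant alpha.
n, dx, alpha = 32, 0.1, 1.0
z = dx * np.arange(n)
shape = (n, n, n)
bx = np.broadcast_to(np.sin(alpha * z), shape).copy()
by = np.broadcast_to(np.cos(alpha * z), shape).copy()
bz = np.zeros(shape)

angle_deg, mean_f = force_free_diagnostics(bx, by, bz, dx)
```

For this field the angle is limited by one-sided differences at the two boundary planes, and the divergence measure vanishes because no component varies along its own direction.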

To have such low values for these diagnostics is a necessary condition for 
the electric currents to organize themselves into a well formed TFR with high 
twist (Fig. 2) during the reconstruction process performed by our model, which 
increases the resolution of the mesh where the current and the magnetic field 
change over smaller scales. Moreover, comparisons with coronal and ribbon data 
provided by emission observations made by AIA (onboard the NASA/SDO mis- 
sion) show interesting correlations (see Extended Data Figs 6 and 7). 
Auxiliary force-free field B’. Electric currents flowing outside the TFR (denoted T) 
of the reconstructed force-free field B are far from being negligible in the active 
region we consider. This appears quite clearly in Extended Data Fig. 2, where we 
can see electric currents of both signs associated with the TFRs that fill the volume 
around or above T. But unfortunately their effects have not yet been taken into 
account in the theoretical force-free models that are generally used for establishing 
the stability criteria of a TFR. To quantify the importance of the electric currents, 
we introduced a new auxiliary force-free magnetic field, B’, constructed in such a 
way that all its electric currents are fully confined inside a TFR (denoted T’) that 
differs as little as possible from T. 

Let us denote as s⁺ (s⁻) the intersection of T with the part S⁺ (S⁻) of positive (negative) 
polarity of the solar surface. To compute B′, we first solve the same Grad-Rubin 
boundary value problem as for B, but with a change in the boundary conditions: 
on S⁺, we require α to vanish outside s⁺ instead of assuming the observed value. 
In this way, we obtain a field whose currents are concentrated in a tube having 
the same intersection with S⁺ as T, but an intersection with S⁻ different from s⁻. 
Therefore, we set up an iterative procedure to transform that field into another 
field B′ that has a TFR T′ as close as possible to T. Each iterative step consists in 
solving a Grad-Rubin problem in which an appropriate value of α is imposed 
either on S⁺ or S⁻. 
Magnetohydrodynamic evolution. Once several reconstructions have been made 
in the four hours before the eruption, we have several different equilibrium states 
that may be a priori expected to be stable, because otherwise the reconstruction 
scheme would not have converged. We therefore undertake a new step in which, 
with the help of a magnetohydrodynamic code, we determine how each of these 
equilibrium states evolves under the action of several processes that are known to 
occur on the solar photosphere: flux cancellation, flow convergence and turbulent 
diffusion. These processes are mimicked by properly chosen boundary condi- 
tions, and the effects of each of them are considered in turn. We do not perform a 
standard stability analysis³⁹, in which one would study how an initial equilibrium 
state evolves either under the effects of some arbitrary superposed perturbations, 
or simply from the interpolation errors left when bringing data from one code to 
the other. Rather, we impose physically motivated nonlinear evolutions on our 
magnetic structure, which stay close to the actual evolution of the active region 
because the computations are run for only a short time. 

The tool we use is our efficient magnetohydrodynamics code METEOSOL, 
which has already been extensively employed and has produced many results¹⁸,⁴⁰,⁴¹, 
in particular in theoretical studies of the effects on a given magnetic structure of 
the surface processes mentioned above. Since the big flare that occurred around 
21:00 UT was not ejective, the calculations are performed on a high-resolution 
METEOSOL mesh set in a large Cartesian box (such a box has been proved 
by previous simulations of coronal mass ejections not to exert any spurious 
confining effect on the magnetic field, thus ensuring that the resulting eruption 
is physically confined and not artificially confined by the box). The mesh 
used by METEOSOL is a staggered Cartesian non-uniform one. The spatial 
discretization of the operators is defined in such a way that magnetic helicity 
and topology are conserved in the weakly resistive limit and that the constraint 
∇ · B = 0 is satisfied to round-off errors. METEOSOL uses an efficient 
time-advance semi-implicit scheme. Some modifications have been introduced since 
the original version of METEOSOL. In particular, the convective terms are 
treated by a third-order non-oscillatory upwind scheme. The model uses several 
different assumptions and options, among which a classical zero-β approach, with 
an imposed profile of density that is either constant or decreasing with altitude. 
We have also tried an initial plasma hydrostatic equilibrium, assuming the adia- 
batic approximation in which the heating terms are neglected. This led to only 
minor changes in the results. For the viscosity and the resistivity, we choose val- 
ues between 10°? and 10-3 (in code units) and in the range 10 * down to 10°°7 
respectively. The results were found to be independent of the values selected for 
these parameters. 
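
The statement that a staggered mesh can keep ∇ · B = 0 to round-off errors can be illustrated with a generic staggered-grid construction. This is a sketch, not METEOSOL's discretization: it assumes uniform spacing (the text says the actual mesh is non-uniform), a random vector potential, and B defined as the discrete curl of A.

```python
import numpy as np

rng = np.random.default_rng(0)
nx, ny, nz = 8, 9, 10            # number of cells in each direction
dx = dy = dz = 0.5               # uniform spacing for simplicity (assumption)

# Vector potential components live on the cell edges of the staggered grid.
ax = rng.standard_normal((nx, ny + 1, nz + 1))
ay = rng.standard_normal((nx + 1, ny, nz + 1))
az = rng.standard_normal((nx + 1, ny + 1, nz))

# B = curl A, with each component located on the cell faces normal to it.
bx = np.diff(az, axis=1) / dy - np.diff(ay, axis=2) / dz   # shape (nx+1, ny, nz)
by = np.diff(ax, axis=2) / dz - np.diff(az, axis=0) / dx   # shape (nx, ny+1, nz)
bz = np.diff(ay, axis=0) / dx - np.diff(ax, axis=1) / dy   # shape (nx, ny, nz+1)

# Divergence at cell centres from face-to-face differences.
div_b = (np.diff(bx, axis=0) / dx + np.diff(by, axis=1) / dy
         + np.diff(bz, axis=2) / dz)
max_div = np.max(np.abs(div_b))
```

The cancellation is algebraic, because the discrete difference operators commute, so the divergence vanishes up to floating-point rounding for any vector potential.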

Applying the method above, we find that the evolution of the pre-eruptive 
configuration of 21:00 UT is driven by any of the three surface mechanisms previously 
listed, which leads after a short time to the development of the kink instability of 
the TFR, which otherwise stays confined (Fig. 4). No instability is found to develop 
during a short period of time when the same procedure is applied to earlier static 
configurations, which we observed to evolve slightly and inconsequentially. 

A further result concerns the effect of applying flux cancellation for a longer 
time and on a larger scale including the magnetic cage. In that case, the resulting 
decrease in the confinement leads to a major disruption of the structure and to an 
ejection of the TFR, that is, a coronal mass ejection is produced as in our previous 
work (Extended Data Fig. 9). This result provides one more illustration of the 
confining power of the cage and then of the effect of the environment on the nature 
of the event (a confined flare versus eruptive flare plus coronal mass ejection) that 
can be produced. 

During the development of the instability of the 21:00 UT (hour H) 
pre-eruptive configuration, magnetic helicity is conserved, keeping a value of about 
−300 x 1047 Mx² (1 maxwell = 1 G cm²; Fig. 4d). However, the configuration 
does not relax between the pre-eruptive and post-eruptive phases to a linear 
force-free field having the same helicity; that is, the Taylor conjecture⁴² is not 
valid here. Instead, nonlinearities are present in the predicted post-eruptive 
state, which is otherwise found to be close to the state obtained by applying 
our static model to the HMI observations made on 25 October 2014 (Extended 
Data Fig. 10). 

Code availability. We have decided not to make our code available. Several parts of 
the code we used are embedded in a complicated way, and are the private property 
of the authors’ institutions. The reader, however, should be able to reproduce our 
results by using the information provided in Methods. 

Data availability. The data files that support our analysis will be made available 
upon reasonable request. 


31. Allen Gary, G. & Hagyard, M. J. Transformation of vector magnetograms and 
the problem associated with the effects of perspective and the azimuthal 
ambiguity. Sol. Phys. 126, 21–36 (1990). 

32. Thompson, W. T. Coordinate systems for solar image data. Astron. Astrophys. 
449, 791–803 (2006). 

33. Titov, V. S., Mikic, Z., Török, T., Linker, J. A. & Panasenco, O. 2010 August 1–2 
sympathetic eruptions. I. Magnetic topology of the source-surface background 
field. Astrophys. J. 759, 70 (2012). 

34. Priest, E. R. Magnetohydrodynamics of the Sun (Cambridge Univ. Press, 2014). 

35. Amari, T., Aly, J.-J., Canou, A. & Mikic, Z. Reconstruction of the solar coronal 
magnetic field in spherical geometry. Astron. Astrophys. 553, A43 (2013). 

36. Amari, T. & Aly, J.-J. Observational constraints on well-posed reconstruction 
methods and the optimization-Grad-Rubin method. Astron. Astrophys. 522, 
A52 (2010). 

37. Wheatland, M. S. & Régnier, S. A self-consistent nonlinear force-free solution 
for a solar active region magnetic field. Astrophys. J. 700, L88–L91 (2009). 

38. Alauzet, F., Frey, P. J., George, P. L. & Mohammadi, B. 3D transient fixed point 
mesh adaptation for time-dependent problems: application to CFD 
simulations. J. Comput. Phys. 222, 592–623 (2007). 

39. Guo, Y., Xia, C., Keppens, R. & Valori, G. Magneto-frictional modeling of coronal 
nonlinear force-free fields. I. Testing with analytic solutions. Astrophys. J. 828, 
82 (2016). 

40. Amari, T., Luciani, J.-F., Aly, J.-J., Mikic, Z. & Linker, J. Coronal mass ejection: 
initiation, helicity and flux ropes. II. Turbulent diffusion driven evolution. 
Astrophys. J. 595, 1231–1250 (2003). 

41. Amari, T., Aly, J.-J., Luciani, J.-F., Mikic, Z. & Linker, J. Coronal mass ejection 
initiation by converging photospheric flows: toward a realistic model. 
Astrophys. J. 742, L27 (2011). 

42. Taylor, J. B. Relaxation of toroidal plasma and generation of reverse magnetic 
fields. Phys. Rev. Lett. 33, 1139–1141 (1974). 


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


LETTER 


Magnetic Energy and Twist 
Extended Data Figure 1 | Evolution of magnetic energies and twist. Evolution during the four hours preceding the eruption of the actual magnetic field 
energy (blue), the potential field energy (black), and the semi-open field energy (purple), expressed in physical units. The evolution of the twist is also 
plotted (red). 
Extended Data Figure 2 | Electric current structure of the magnetic 
environment. Set of selected flux ropes, including the central highly 
twisted flux rope and some ropes having a non-negligible twist of more 
than 0.5. These ropes are located around and above the major TFR and 
reveal a complex non-potential environment. Those in blue (red) are 
associated with a negative (positive) value of α. 
Extended Data Figure 3 | Index of torus instability. a, Magnetogram at 
21:00 UT with rectangles indicating the sample area in which the index n 
is computed. The yellow rectangle is located just below the TFR, while the 
black rectangle, used as a reference, is chosen outside it. b, Variation with 
altitude of the torus index computed above the sample areas shown in a 
(with the same colour coding) by using the horizontal component of the 
mean potential (current-free) magnetic field. The horizontal line 
indicates the critical value of the index often used for the torus instability, 
while the vertical one indicates the height of the TFR axis. 




[Figure panel "Central rectangle": mean field strength <||B||(z)> (gauss) versus height (Mm).] 

Extended Data Figure 4 | Role of volumetric current. Variation of the 
magnetic intensity mean value above the rectangles shown in Extended 
Data Fig. 3a for the current-free solution (red curve), the force-free 
solution B′, computed by removing the photospheric electric current 
exterior to the base of the TFR (green curve), and the full force-free 
solution (B), computed by taking into account the total electric current 
(blue curve). The number ξ measures the degree of removal of those 
external currents, with ξ = 0 (ξ = 1) indicating total (no) removal. 
As in Extended Data Fig. 3, the vertical yellow line indicates the height of 
the TFR axis. a, The central rectangle; b, the reference rectangle outside 
the TFR area. 
Convergence Diagnostics 
magnetic field (a good measure of how force-free the solution is) during number. 
Extended Data Figure 6 | Comparison with AIA data. a, Selected field 
lines and set of different isosurfaces of the force-free scalar function α 
(red for positive values and blue for negative values) for the reconstructed 
pre-eruptive magnetic configuration of 24 October 2014 at 21:00 UT. 
b, Corresponding composite image from the AIA-131 Å and AIA-171 Å 
wavelength data. c, Synthetic emissivity computed by using the magnetic 
field and the electric current density of the reconstructed pre-eruptive 
magnetic configuration of 24 October 2014 at 21:00 UT. 
Extended Data Figure 7 | Comparison with AIA-1,600 Å wavelength. 
a, AIA emission at 1,600 Å. b, Plot of the vertically integrated dissipation 
J² (where J is the norm of the electric current density) above the regions 
with high values of the squashing factor, as for the reconstructed 
pre-eruptive magnetic configuration of 24 October 2014 at 21:00 UT. This 
plot highlights the strong electric current regions, in which reconnection 
is expected to occur. 
Extended Data Figure 8 | Extreme-ultraviolet emission and magnetic structure. a, Selected field lines of the evolving magnetic configuration during 
the flare. b, AIA emission at 94 Å on 24 October 2014 at 22:00 UT. c, Synthetic emissivity computed by using the magnetic field and the electric current 
density of the evolving magnetic configuration during the flare. 
Extended Data Figure 9 | Major eruption and role of the magnetic 
environment. Selected field lines of the configuration that evolved into 
a major coronal mass ejection from the pre-eruptive configuration of 24 
October 2014 at 21:00 UT when flux cancellation was applied on a larger 
scale, including the magnetic cage, whose confinement effect has thus 
been weakened. 
Extended Data Figure 10 | Post-eruptive state. Comparison of the post-eruptive states obtained from the simulation after the full relaxation of the 
evolving unstable state (a) and using HMI vector magnetic data from 25 October 2014 (b). 
Isomer depletion as experimental evidence of 
nuclear excitation by electron capture 


C. J. Chiara, J. J. Carroll, M. P. Carpenter, J. P. Greene, D. J. Hartley, R. V. F. Janssens, G. J. Lane, J. C. Marsh, D. A. Matters, 
M. Polasik, J. Rzadkiewicz, D. Seweryniak, S. Zhu, S. Bottoni, A. B. Hayes & S. A. Karamian 


The atomic nucleus and its electrons are often thought of as 
independent systems that are held together in the atom by their 
mutual attraction. Their interaction, however, leads to other 
important effects, such as providing an additional decay mode 
for excited nuclear states, whereby the nucleus releases energy by 
ejecting an atomic electron instead of by emitting a γ-ray. This 
'internal conversion' has been known for about a hundred years 
and can be used to study nuclei and their interaction with their 
electrons. In the inverse process, nuclear excitation by electron 
capture (NEEC), a free electron is captured into an atomic vacancy 
and can excite the nucleus to a higher-energy state, provided that 
the kinetic energy of the free electron plus the magnitude of its 
binding energy once captured matches the nuclear energy difference 
between the two states. NEEC was predicted in 1976 and has not 
hitherto been observed. Here we report evidence of NEEC in 
molybdenum-93 and determine the probability and cross-section 
for the process in a beam-based experimental scenario. Our results 
provide a standard for the assessment of theoretical models relevant 
to NEEC, which predict cross-sections that span many orders of 
magnitude. The greatest practical effect of the NEEC process may be 
on the survival of nuclei in stellar environments, in which it could 
excite isomers (that is, long-lived nuclear states) to shorter-lived 
states. Such excitations may reduce the abundance of the isotope 
after its production. This is an example of 'isomer depletion', which 
has been investigated previously through other reactions, but is 
used here to obtain evidence for NEEC. 

We searched for evidence of the NEEC process in ⁹³Mo following an 
approach adapted from a technique proposed in ref. 13. Previous 
theoretical work also considered ⁹³Mo as a NEEC candidate. This nucleus 
has a 21/2⁺ isomer at 2,425 keV with a half-life of 6.85 h and a (17/2)⁺ 
candidate intermediate state that lies 4.85(9) keV higher at 2,430 keV 
(ref. 14), as illustrated in Fig. 1. (Uncertainties are quoted at the 1σ or 
68% confidence level, unless otherwise noted.) In our approach, ⁹³Mo 
is produced in its metastable state (⁹³ᵐMo) through nuclear reactions. 
Choosing heavy projectiles and light target nuclei results in recoiling 
⁹³ᵐMo reaction products ('recoils') that move at high velocities v 
(initially more than 10% of the speed of light, c) in approximately the 
same direction as the beam. As the fast-moving recoils pass through the 
target medium, electrons are stripped off, leaving the ⁹³ᵐMo ions with 
a high average charge between about +32 and +36. Subsequent 
collisions with target atoms reduce the energy of the recoiling ions 
while simultaneously providing electrons that can be captured back into 
the vacated atomic orbitals. At the right combination of the charge state 
of the ⁹³ᵐMo ion and the effective electron kinetic energy, as seen from 
the reference frame of the recoiling ion, capture can occur, releasing 
enough energy to match that needed (ΔE = 4.85 keV) to excite the 
nucleus from the isomer to the intermediate state; that is, NEEC 
occurs. As described in detail by recent theoretical calculations, these 
NEEC conditions could be fulfilled in practice by capture of an electron 
at the necessary relative kinetic energy, to within the line width of the 
capturing atomic vacancy (the NEEC resonance width). The intermediate 
state in ⁹³Mo that would be populated through NEEC is known 
to decay (with a half-life of t₁/₂ = 3.5 ns) to the ground state through a 
characteristic sequence of γ-rays at 268 keV, 685 keV and 1,478 keV 
(Fig. 1). Notably, the 268-keV transition would never be seen in the 
natural decay of the 21/2⁺ isomer. 
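
The energy-matching condition described above, E_kin + |E_b| = ΔE, can be illustrated with a short numerical sketch. It is an illustration only, not part of the analysis: it assumes that the captured electron is initially at rest in the laboratory frame (so that its kinetic energy in the ion frame follows from the recoil velocity alone), and the velocities other than v/c = 0.109 are arbitrary example values. The function names are hypothetical.

```python
import math

M_E_C2_KEV = 510.999   # electron rest energy (keV)
DELTA_E_KEV = 4.85     # isomer-to-intermediate-state gap quoted in the text (keV)


def electron_kinetic_energy_kev(beta: float) -> float:
    """Kinetic energy, in the ion rest frame, of an electron assumed to be
    at rest in the laboratory while the ion moves with velocity beta*c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma - 1.0) * M_E_C2_KEV


def required_binding_energy_kev(beta: float, delta_e_kev: float = DELTA_E_KEV) -> float:
    """Binding-energy magnitude |E_b| that would satisfy E_kin + |E_b| = delta_E."""
    return delta_e_kev - electron_kinetic_energy_kev(beta)


# 0.109 is the average recoil velocity in the gap quoted in Methods;
# the smaller values are illustrative only (assumption).
for beta in (0.05, 0.08, 0.109):
    e_kin = electron_kinetic_energy_kev(beta)
    e_b = required_binding_energy_kev(beta)
    print(f"beta = {beta:.3f}: E_kin = {e_kin:.2f} keV, |E_b| needed = {e_b:.2f} keV")
```

As the recoil slows in the target, the electron kinetic energy seen by the ion falls, so a different atomic vacancy (binding energy) is needed to close the gap. This is why the charge state and the recoil energy must match simultaneously.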

The ⁹⁰Zr + ⁷Li fusion-evaporation reaction was selected for our 
experiment, which was performed at the ATLAS facility at Argonne 
National Laboratory. An 840-MeV ⁹⁰Zr beam was provided with an 
average beam intensity of about 6 x 10° ions s~'. The multi-layer 
target was composed of ⁷Li supported by a natural carbon (ⁿᵃᵗC) foil, 
followed by a gap of about 3 mm and an additional ⁿᵃᵗC layer, backed 
with ²⁰⁸Pb (see Methods for the choice of reaction and target 
construction). The target was positioned at the centre of the Gammasphere 
γ-ray spectrometer, which, at the time of the experiment, comprised 
92 Compton-suppressed, high-purity Ge detectors arranged in 16 rings 
of constant angle relative to the beam direction. A minimum of three 
γ-rays within a 2-μs coincidence window was required for events to 
be recorded to disk, although a narrower, sub-microsecond coincidence 
constraint was imposed in the offline analysis. We operated the 
array with a digital data acquisition system (Digital Gammasphere) 
at average event rates of 40-50 kHz. We collected data for about 62 h 
in this configuration. 

After the fusion of the ⁹⁰Zr projectiles and the ⁷Li target nuclei, and 
the subsequent evaporation of a proton and three neutrons from the 
compound nuclear system, excited states above the ⁹³ᵐMo isomer 
were populated. These excited states depopulated through decay paths 
that, in some cases, fed the 21/2⁺ isomeric state, such as through the 
2,475-keV, (25/2) → 21/2⁺ transition (parentheses denote tentative 
assignments). If a recoiling ⁹³Mo ion is in this isomeric state when 
the energy-charge resonance conditions are met, NEEC may occur, 
shifting some of the population of the isomer to the 3.5-ns intermediate 
state. An observable signature of NEEC would then be the detection 
of a 2,475-keV γ-ray, which would confirm that the nucleus had 
reached the ⁹³ᵐMo state, within the same coincidence window as one 
or more γ-rays from the decay of the intermediate level. The 2,475-keV 
Figure 1 | Relevant part of the ⁹³Mo decay scheme. The energy and the spin 
and parity quantum numbers (Iπ, where I denotes the spin and π = +, − the 
parity) are noted for each level, with tentative assignments in parentheses. 
The half-lives t₁/₂ given for the two isomeric states are from ref. 14. Level and 
γ-ray energies are from this study and are given in kiloelectronvolts. The 
10-keV transition between the (13/2) and (11/2) states was not observed 
in this work, but was inferred from the measured coincidence relationships. 
The NEEC transition from the isomer to the intermediate state is indicated 
by the dashed red line. The key 268-keV γ-ray is also indicated in red. 


γ-ray would not ordinarily be found in true coincidence with the 
685- or 1,478-keV transitions because of the narrow width (2 μs) 
of the coincidence window compared to the 6.85-h half-life of the 
metastable state, nor would it ever be coincident with the 268-keV 
γ-ray. These coincidences would be possible, however, if some 
mechanism, such as NEEC, excited the nucleus from the long-lived isomer 
to the intermediate state. 

Each recorded event, consisting of the data for three or more 
coincident γ-rays, was decomposed into all the constituent subsets of three 
γ-rays. Energy conditions, or 'gates', were imposed to select two γ-rays 
out of each set of three, and the counts in a coincidence spectrum were 
incremented at the energy of the third photon. Background counts, 
determined by gating on energy ranges near the peaks of interest, were 
subtracted (see Methods for the gating and background-subtraction 
procedures, and for the impact of the Doppler effect). A 
background-subtracted spectrum, produced by a double gate on the 2,475- 
and 1,478-keV transitions, is presented in Fig. 2a. For comparison, a 
background-subtracted spectrum, single-gated on just the 1,478-keV 
γ-ray (that is, the second γ-ray can be at any energy), is also shown in 
Fig. 2b. In the latter spectrum, the more intense transitions that are 
coincident with the 1,478-keV γ-ray (see Fig. 1) are clearly visible. With 
the additional constraint of a gate on 2,475 keV, these lines disappear 
Figure 2 | Spectra demonstrating the signature of NEEC in ⁹³Mo. No 
correction for the Doppler effect has been applied. a, Spectrum obtained 
with a double gate on the Doppler-shifted 2,475-keV and unshifted 
1,478-keV γ-rays. b, Spectrum obtained with a single gate on the unshifted 
1,478-keV line. c, Spectrum obtained with a double gate on the 
Doppler-shifted 2,475-keV and unshifted 268-keV γ-rays. Peaks of ⁹³Mo shown 
in Fig. 1 are labelled with their energies in kiloelectronvolts. Additional 
known ⁹³Mo transitions, not shown in Fig. 1, are marked with asterisks 
in b. The label 'e⁺e⁻' indicates the 511-keV electron-positron annihilation 
peak. We note that transitions located above the isomer are too spread out 
in energy by the Doppler effect to be visible in these spectra. 


apart from the 268- and 685-keV peaks (Fig. 2a). The counts in the key 
268-keV line are about 7σ above background, where σ is the standard 
deviation of the counts in the nearby background region (statistical 
error), or over 3σ when the uncertainty associated with background 
subtraction is also taken into account (statistical plus systematic errors, 
as described in Methods). A similar gating procedure on the 2,475- and 
268-keV transitions reveals peaks at 685 keV and 1,478 keV, as seen in 
Fig. 2c. Combined, these spectra demonstrate that the γ-ray sequence 
2,475 keV–268 keV–685 keV–1,478 keV occurs in coincidence, which 
is the expected signature of isomer depletion via NEEC. We note that, 
unlike previous examples of isomer depletion, here the energy of 
the excitation to the intermediate state is much smaller than that of 
the isomer itself. 
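
The triple decomposition and double gating described above can be sketched as follows. This is a minimal illustration, not the analysis code used for Fig. 2: events are assumed to be plain tuples of γ-ray energies in keV, each gate is a single (low, high) energy window rather than the ring-dependent windows described in Methods, and the function names and toy events are hypothetical.

```python
from collections import Counter
from itertools import combinations


def in_gate(energy, gate):
    """True if the energy (keV) lies inside the (low, high) gate window."""
    low, high = gate
    return low <= energy <= high


def double_gated_spectrum(events, gate_a, gate_b, bin_width=1.0):
    """Histogram the third gamma ray of every triple whose other two gamma
    rays satisfy gate_a and gate_b (in either order)."""
    spectrum = Counter()
    for event in events:
        # Events with fewer than three gamma rays contribute no triples.
        for triple in combinations(event, 3):
            for i, third in enumerate(triple):
                x, y = triple[:i] + triple[i + 1:]
                if (in_gate(x, gate_a) and in_gate(y, gate_b)) or (
                    in_gate(y, gate_a) and in_gate(x, gate_b)
                ):
                    spectrum[int(third // bin_width)] += 1
    return spectrum


# Toy events (hypothetical energies in keV).
events = [
    (2470.0, 1478.2, 268.1),
    (2470.0, 1478.2, 268.1, 685.0),
    (1478.0, 300.0),  # only two gamma rays: ignored
]
spectrum = double_gated_spectrum(events, gate_a=(2200.0, 2750.0), gate_b=(1476.0, 1480.0))
print(sorted(spectrum.items()))
```

In this toy input the 268-keV bin is incremented twice and the 685-keV bin once, because each triple that contains both gating energies contributes its remaining γ-ray to the spectrum.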

To determine the probability of this excitation, P_exc, we summed 
background-subtracted spectra that were double-gated on the 
Doppler-shifted lines at 241 keV and 1,442 keV and at 241 keV and 686 keV (see 
Methods for details). Two versions of this spectrum were generated, as 
shown in Fig. 3. One had a Doppler correction applied for v/c = 0.109 
recoils (Fig. 3a), so that the area of the 2,475-keV peak could be 
determined. The other spectrum had no Doppler correction (Fig. 3b) and 
was used to obtain the area of the 268-keV peak. As there is no 
identified decay path from the gating transitions to the 268-keV γ-ray, aside 
from those involving an excitation such as NEEC, the ratio of the areas 
of the 268- and 2,475-keV peaks, when corrected for their detection 
efficiencies, yields the probability that a ⁹³ᵐMo ion in the long-lived 
isomeric state (reached via the 2,475-keV transition) subsequently 
emits a 268-keV γ-ray. An additional, small correction for internal 
conversion provides the excitation probability, which we determined 
to be P_exc = 0.010(3). The corresponding cross-section, averaged over 
the full target thickness, is 40 b (see Methods). This is a lower limit, 
however, as NEEC is a resonant process; the peak cross-section may 
be much larger. Given that existing cross-section calculations involve 
very different conditions, such as channelling in a crystal or in 
laser-induced plasmas, and predict values spanning many orders of 
magnitude, an in-depth theoretical analysis of our experimental setting is 
needed. Such analysis is beyond the scope of this study. 
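
As a consistency check on the quoted cross-section, the conversion from excitation probability to a thickness-averaged cross-section, σ = P_exc/n with n = ρN_A/A, can be sketched numerically. The probability (0.010) and the carbon areal density (5 mg cm⁻², from Methods) are taken from the text; the molar mass of natural carbon (12.011 g mol⁻¹) is a standard value added here, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23   # mol^-1
BARN_IN_CM2 = 1.0e-24      # 1 b = 1e-24 cm^2


def areal_number_density(rho_mg_cm2: float, molar_mass_g_mol: float) -> float:
    """Target atoms per cm^2 for a layer of areal density rho (mg cm^-2)."""
    return rho_mg_cm2 * 1.0e-3 * AVOGADRO / molar_mass_g_mol


def cross_section_barn(p_exc: float, rho_mg_cm2: float, molar_mass_g_mol: float) -> float:
    """Thickness-averaged cross-section sigma = P_exc / n, in barns."""
    n = areal_number_density(rho_mg_cm2, molar_mass_g_mol)
    return p_exc / n / BARN_IN_CM2


# Values from the text: P_exc = 0.010, rho = 5 mg cm^-2 of carbon.
# 12.011 g/mol for natural carbon is a standard value, not from the text.
sigma_b = cross_section_barn(0.010, 5.0, 12.011)
print(f"sigma = {sigma_b:.1f} b")
```

With these inputs the result is close to the 40 b quoted above. Because the average is taken over the full carbon thickness while NEEC is resonant, the value is a lower limit on the peak cross-section, as the text notes.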

At the Heavy Ion Accelerator Facility (HIAF) at the Australian 
National University (ANU), we explored whether the observed 
Figure 3 | Spectra used to determine the NEEC probability in ⁹³Mo. The 
spectra were generated by applying double gates on the Doppler-shifted 
241-keV peak and either the 1,442- or 686-keV transitions, placed so as 
to avoid overlap with the 1,478- and 685-keV ⁹³Mo lines from stopped 
nuclei at particular angles. a, High-energy part of the spectrum, with 
Doppler correction. b, Low-energy part of the same spectrum, but with 
no Doppler correction. The asterisks mark known transitions in ⁹³Mo; the 
contaminant labelled with a 'c' comes from Ru produced in reactions 
between the ⁹⁰Zr beam and the C layer of the target. 


coincidences could have originated from a different reaction in our 
experiment. A control reaction between ⁷Li and ⁹⁰Zr that created ⁹³Mo 
recoils with energies that were too small to produce the ionization and 
electron kinetic energies required for NEEC indeed did not yield the 
coincident γ-rays that would be a signature of NEEC. Additionally, these 
γ-rays cannot be attributed to reactions between the ⁹⁰Zr projectiles 
and the C or Pb layers in the target (see Methods). We also calculated 
the probability of ⁹³ᵐMo being excited to the intermediate state through 
inelastic scattering (Coulomb and nuclear interaction) at recoil energies 
above the Coulomb barrier in ⁷Li and in ¹²C with the 
coupled-channel-reaction code FRESCO, and through Coulomb excitation at energies 
below the barrier in ²⁰⁸Pb with the semi-classical coupled-channel 
Coulomb-excitation code GOSIA using RACHEL. The resulting 
probabilities are 6 x 10-8, 2 x 10° and 3 x 10~° in the Li, C and Pb 
target layers, respectively, all of which are too small to account for the 
experimental value P_exc = 0.010(3) deduced from our data. Thus, the 
observed coincidences that provide experimental evidence of NEEC 
do not appear to originate from contaminant reactions or from other 
well-established excitation mechanisms. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Reaction choice and target construction. The choice of the beam-target 
combination for this experiment was based on estimates of the production 
cross-section of ⁹³ᵐMo compared to other reaction products and on the need for 
recoil energies to exceed the NEEC threshold. The cross-sections were calculated 
using the Monte Carlo fusion-evaporation code PACE4 and qualitatively verified 
experimentally at the HIAF at ANU (C.J.C. et al., manuscript in preparation). 
A solid target was preferred over He gas to provide a more compact setup inside 
Gammasphere (a gas cell tens of centimetres in length would have been needed) 
and to strip traversing ions to a higher charge state. The ⁹⁰Zr + ⁷Li combination 
suitably met these criteria. 

The construction and handling of the Li target needed to address several 
considerations. (1) Li readily oxidizes when exposed to air. (2) Because the 
decay sequence that is indicative of NEEC involves an intermediate state with 
t₁/₂ = 3.5 ns, enough stopping material must be used to prevent reaction products 
from moving downstream of the target, out of the view of the Ge detectors. 
(3) Fusion reactions occurring beyond the first few milligrams per square 
centimetre of a Li target would produce residues with recoil energies too low to satisfy 
the NEEC resonance condition. This would yield an incorrect value for the NEEC 
probability with our method (see Methods section 'Excitation probability and 
cross-section'), as the 2,475-keV transition to the isomer could be observed for 
some events with no possibility of NEEC occurring. Thus, stopping all reaction 
products with a thick Li target (about 15 mg cm⁻² would be required; 
http://www.srim.org) is not desirable. (4) Beam ions that traverse the Li layer without reacting 
can also potentially induce reactions, such as fusion-evaporation or deep-inelastic 
reactions, with any material placed behind the Li as a stopper. Such reactions, 
particularly deep-inelastic processes, would yield a large background from dozens 
of different nuclei, potentially clouding the NEEC signature being sought. This 
contamination can be considerably reduced by ensuring that the energy of the 
beam ions incident upon the stopper is reduced to below the Coulomb barrier of 
the stopping material (ideally, an element with large atomic number Z). (5) For a 
given ion kinetic energy, a low-Z stopping material is more effective at stripping 
electrons from the ions down to the inner atomic shells, a necessary condition 
for NEEC. (6) The half-lives of most states above the 21/2⁺ isomer are not known, 
and some may be comparable to or exceed the stopping time of residues in the 
target or stopper (of the order of picoseconds). In our approach, observation of a 
γ-ray directly feeding the isomer was required to ensure that the isomeric state was 
indeed populated. If the residue has slowed substantially before emission of that 
γ-ray, the recoil energy may already lie below the NEEC resonance, preventing 
this process from occurring. 

To best address these conflicting requirements, the target was prepared as 
follows. A 1.55 mg cm⁻² deposit of ⁿᵃᵗLi (abundance of ⁷Li in ⁿᵃᵗLi, about 92%) was 
evaporated onto a 0.50 mg cm⁻² ⁿᵃᵗC foil. For the stopping material, 33 mg cm⁻² of 
²⁰⁸Pb was placed behind a 4.2 mg cm⁻² layer of ⁿᵃᵗC. The target and stopper were 
mounted together on a frame with a gap of about 3 mm between the two C layers, 
as illustrated in Extended Data Fig. 1a. To minimize the risk of oxidation, the target 
was prepared under vacuum in an evaporator at the ATLAS target laboratory and 
transported to Gammasphere without breaking the vacuum using the technique 
described in ref. 29. The target was inserted into the beam path at an angle of 27° 
from the vertical, increasing the effective thickness of each layer by 12% (Extended 
Data Fig. 1b). With this target configuration, ⁷Li(⁹⁰Zr, p3n)⁹³Mo reactions in the 
first layer can produce ⁹³Mo at high spins and with large recoil energies. The kinetic 
energy decreases in the remaining Li and thin C layers, before the recoils enter the 
gap. Then, they drift with constant velocity through vacuum for about 100 ps. Any 
decays with a cumulative timescale smaller than tens of picoseconds will occur 
predominantly before the Mo ions reach the thicker C layer across the gap; this 
ensures that the kinetic energies of ⁹³Mo recoils in the isomeric state are still above 
the NEEC threshold. (A target arrangement with a thicker layer of Li evaporated 
directly onto a ²⁰⁸Pb backing, with no gap, was used initially. With this target, the 
line shape of the 2,475-keV γ-ray that feeds the isomer was found to have a sizable 
narrow component from stopped nuclei. Introducing a gap between the Li target 
and the stopper eliminates this component, as demonstrated in Extended Data 
Fig. 2.) The recoils then slow down further in the thicker C layer, where NEEC can 
occur. Finally, the Mo ions come to rest in the ²⁰⁸Pb backing, permitting 
observation of γ-rays emitted from longer-lived states, such as the 3.5-ns intermediate 
state. The combined thickness of the C layers was chosen to reduce the ⁹⁰Zr beam 
energy to below the Coulomb barrier with ²⁰⁸Pb, so that only a limited number 
of well-known γ-ray transitions produced through Coulomb excitation of ⁹⁰Zr or 
²⁰⁸Pb would be expected. 
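
Two of the numbers in the target description above can be checked with elementary geometry and kinematics: the 12% increase in effective thickness from the 27° tilt, and the drift time of about 100 ps across the 3-mm gap. The sketch below assumes that the recoils travel along the beam axis at the average velocity v = 0.109c quoted for the gap, and that the tilt lengthens the path through the gap by the same 1/cos factor as the layers; both are simplifying assumptions, and the function names are hypothetical.

```python
import math

C_MM_PER_PS = 0.299792458  # speed of light in mm per picosecond


def thickness_factor(tilt_deg: float) -> float:
    """Factor by which a planar layer's effective thickness grows when its
    normal is tilted by tilt_deg relative to the beam axis."""
    return 1.0 / math.cos(math.radians(tilt_deg))


def drift_time_ps(gap_mm: float, beta: float, tilt_deg: float = 0.0) -> float:
    """Time to cross a vacuum gap at constant velocity beta*c, with the path
    along the beam lengthened by the tilt (assumption)."""
    path_mm = gap_mm * thickness_factor(tilt_deg)
    return path_mm / (beta * C_MM_PER_PS)


factor = thickness_factor(27.0)
t_untilted = drift_time_ps(3.0, 0.109)
t_tilted = drift_time_ps(3.0, 0.109, tilt_deg=27.0)
print(f"effective-thickness factor: {factor:.3f}")
print(f"drift time: {t_untilted:.0f} ps (no tilt), {t_tilted:.0f} ps (27 deg tilt)")
```

Both drift-time estimates are of the order of 100 ps, long compared with the picosecond-scale stopping times mentioned above and short compared with the 3.5-ns half-life of the intermediate state.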

Spectrum construction and background subtraction. The broad range of 
level half-lives in the reaction products, combined with the arrangement of a 
target and stopper separated by a vacuum gap, resulted in complex γ-ray energy 
spectra: prompt γ-rays were emitted by moving nuclei (average velocity in the 
gap, v = 0.109c), with their energies shifted and peaks broadened owing to the 
Doppler effect, while slower decays were from stopped nuclei, with no energy 
shifts or Doppler broadening. Although a correction for the Doppler shift can be 
applied for a given residue velocity and direction of γ-ray emission, the origin of the 
γ-rays (from moving or stopped nuclei) is not known on an event-by-event basis. 
Therefore, it was not possible to apply the appropriate Doppler correction to all 
γ-rays simultaneously. For the transitions of interest here, the 2,475-keV γ-ray, 
which lies above the 21/2⁺ isomer, is Doppler shifted, while the 268-, 685- and 
1,478-keV γ-rays all lie below the 3.5-ns state and would appear as narrow, 
unshifted lines. 
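
The size of the Doppler shift for the 2,475-keV line can be estimated with the standard relativistic formula E = E₀ / [γ(1 − β cos θ)]. The sketch below is illustrative only: it uses the quoted average velocity β = 0.109, assumes emission from recoils moving along the beam axis, and takes 17.3° and 162.7° as the most forward and most backward detector angles, which is an assumption not stated in the text.

```python
import math


def doppler_shifted_energy(e0_kev: float, beta: float, theta_deg: float) -> float:
    """Laboratory energy of a gamma ray of rest-frame energy e0_kev emitted by
    a source moving with velocity beta*c, observed at angle theta_deg to the
    direction of motion."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return e0_kev / (gamma * (1.0 - beta * math.cos(math.radians(theta_deg))))


def doppler_corrected_energy(e_lab_kev: float, beta: float, theta_deg: float) -> float:
    """Inverse operation: recover the rest-frame energy from the observed one."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return e_lab_kev * gamma * (1.0 - beta * math.cos(math.radians(theta_deg)))


BETA = 0.109                      # average recoil velocity in the gap (from the text)
FORWARD, BACKWARD = 17.3, 162.7   # assumed extreme detector angles (degrees)

e_fwd = doppler_shifted_energy(2475.0, BETA, FORWARD)
e_bwd = doppler_shifted_energy(2475.0, BETA, BACKWARD)
print(f"forward: {e_fwd:.0f} keV, backward: {e_bwd:.0f} keV, spread: {e_fwd - e_bwd:.0f} keV")
```

Under these assumptions the line is spread over roughly 500 keV across the array, which is why a single Doppler correction cannot sharpen the shifted and unshifted lines at the same time.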

The anticipated NEEC signature in this experiment would be the observation 
of a Doppler-shifted 2,475-keV transition in coincidence with the sequence of the 
268-, 685- and 1,478-keV γ-rays from stopped nuclei, bypassing the 21/2⁺ isomer. 
Double-gated, background-subtracted energy spectra were generated as follows. 
A gate at a particular energy, denoted g in the following, means that the energy of 
a γ-ray in the detected events fell within designated limits; this gate is composed 
of the peak of interest (p) atop a smooth background (b), g = p + b. A separate 
energy gate was placed on a nearby, flat part of the spectrum to approximate the 
background component of the gate. Peak-peak (background-subtracted) 
coincidence spectra were then constructed for double gates on peaks 1 and 2 using the 
linear combination 

$$p_1 p_2 = (g_1 - b_1)(g_2 - b_2) = g_1 g_2 - g_1 b_2 - b_1 g_2 + b_1 b_2 \tag{1}$$

where each term on the right-hand side represents the energy bins of a spectrum 
of the γ-rays in coincidence with those falling within the indicated gate and/or 
background regions. 
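
The linear combination in equation (1) is applied bin by bin to four double-gated spectra. A minimal sketch, assuming that each spectrum is a list of counts on a common energy binning; the toy counts and the function name are hypothetical.

```python
def peak_peak_spectrum(g1g2, g1b2, b1g2, b1b2):
    """Bin-by-bin evaluation of equation (1):
    p1p2 = g1g2 - g1b2 - b1g2 + b1b2."""
    return [gg - gb - bg + bb for gg, gb, bg, bb in zip(g1g2, g1b2, b1g2, b1b2)]


# Toy spectra with three bins (hypothetical counts). The middle bin holds
# only smooth background, so it should cancel to zero.
g1g2 = [120, 45, 300]
g1b2 = [40, 44, 100]
b1g2 = [30, 41, 90]
b1b2 = [10, 40, 30]

p1p2 = peak_peak_spectrum(g1g2, g1b2, b1g2, b1b2)
print(p1p2)
```

The b1b2 term is added back because the background-background coincidences are subtracted twice, once in each of the two mixed terms.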

Although the γ-ray spectrum is sparser around 2.5 MeV than at lower energies, 
the Doppler-shifted and broadened 2,475-keV peak spans an energy range of 
approximately 500 keV when observed over all of the Gammasphere detector 
angles, overlapping with various lines from stopped nuclei at specific angles. For 
each ring of detectors, the gate and background windows were selected so as to 
limit inclusion of such overlapping contaminant lines. The gate on the narrow 
1,478-keV line from stopped nuclei was considerably simpler, as the peak is at 
the same energy in all rings. However, for this gate it was necessary to reject 
Gammasphere rings 6 to 8 (70° to 81°) because of interference from the Doppler- 
shifted 1,442-keV transition at those angles. For each transition (and ring), the 
width of the background window was made the same as that of the corresponding 
gate. Extended Data Fig. 3 presents the four component spectra of equation (1) 
for the double gate on the 2,475-keV and 1,478-keV peaks, which was used for 
Fig. 2a. The energies in these spectra have not been corrected for the Doppler effect 
in order to show the lines from the stopped nuclei. 

Gating in a nearby energy region to approximate the background under a 
discrete peak is a well-established technique. For the 2,475-keV γ-ray, however, 
the Doppler broadening of the peak at each angle (full-width at half-maximum 
of tens of kiloelectronvolts, compared to just a few kiloelectronvolts for lines 
from stopped nuclei) requires that the background gate be placed farther from 
the peak position than usual. We therefore investigated the effectiveness of our 
background subtraction method for the 2,475-keV gate. A close inspection of the 
spectra in Extended Data Fig. 3 reveals that, for the most part, the same peaks 
appear in each spectrum, with exceptions including the lines at 123, 203, 770 
and 963 keV. These four γ-rays are only visible above the background in spectra 
g₁g₂ and b₁g₂; this indicates that they are all in true, prompt coincidence with 
the 1,478-keV transition (Fig. 1), but they are coincident with only background 
Compton-continuum γ-rays in the energy region near 2,475 keV. As these 
γ-rays should not be in true coincidence with the double gate on 2,475 keV and 
1,478 keV (even if NEEC occurred), they are considered background lines. To 
ensure that the background is suitably subtracted, the spectra g₁p₂ = g₁g₂ − g₁b₂ 
and b₁p₂ = b₁g₂ − b₁b₂ were constructed. The peak areas of the four background 
γ-rays were fitted in both spectra and the ratio k = (area in g₁p₂)/(area in b₁p₂) 
was calculated for each. The average ratio for the four transitions, k_ave, 
essentially describes the amount of background attributable to Compton γ-rays in the 
2,475-keV gate that needs to be subtracted to eliminate these peaks, with the 
resulting spectrum defined as 

$$p_1 p_2 = g_1 p_2 - k_{\mathrm{ave}}\, b_1 p_2 = (g_1 g_2 - g_1 b_2) - k_{\mathrm{ave}} (b_1 g_2 - b_1 b_2) \tag{2}$$


We find k_ave = 0.99(8), which means that equations (1) and (2) are equivalent 
here within the uncertainty. The p₁p₂ spectrum in Fig. 2a, which uses k_ave = 0.99, demonstrates that the 
background lines are eliminated whereas the 268- and 685-keV peaks remain. By 
contrast, eliminating these two lines would require k = 1.33(7) and 1.21(7), 
respectively; if these peaks originated from true coincidence with the 1,478-keV gate and 
the Compton background around 2,475 keV, their k values would also be close 
to k_ave = 0.99. We used a similar procedure for the double gate on the 2,475- and 
268-keV peaks (with a different set of γ-rays in true coincidence with the 268-keV 
gate) and determined an average ratio k_ave = 1.03(10); the central value 1.03 was 
used for the spectrum in Fig. 2c. 
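
The scaled subtraction of equation (2) and the determination of k_ave can be sketched in the same bin-by-bin style. The peak areas below are invented for illustration, and taking an unweighted mean of the four ratios is an assumption (the text does not state how the average was formed); the function names are hypothetical.

```python
def average_ratio(areas_in_g1p2, areas_in_b1p2):
    """Unweighted mean (assumption) of k = (area in g1p2) / (area in b1p2)
    over the reference background lines."""
    ratios = [a / b for a, b in zip(areas_in_g1p2, areas_in_b1p2)]
    return sum(ratios) / len(ratios)


def scaled_peak_peak_spectrum(g1g2, g1b2, b1g2, b1b2, k):
    """Bin-by-bin evaluation of equation (2):
    p1p2 = (g1g2 - g1b2) - k * (b1g2 - b1b2)."""
    return [(gg - gb) - k * (bg - bb) for gg, gb, bg, bb in zip(g1g2, g1b2, b1g2, b1b2)]


# Hypothetical fitted areas of the four background lines (123, 203, 770, 963 keV).
k_ave = average_ratio([100.0, 52.0, 75.0, 41.0], [98.0, 55.0, 74.0, 42.0])
print(f"k_ave = {k_ave:.3f}")

# With k = 1, equation (2) reduces to equation (1).
toy = ([120, 45, 300], [40, 44, 100], [30, 41, 90], [10, 40, 30])
print(scaled_peak_peak_spectrum(*toy, k=1.0))
```

A line that needs k well above k_ave to vanish, as reported for the 268- and 685-keV peaks, is not explained by Compton background under the 2,475-keV gate alone.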

The method of subtracting spectra gated on energies near the peaks of interest 
eliminates contributions from the smooth Compton background. We estimate 
from the ⁹³Mo reaction rate that potential contributions of chance coincidences 
(prompt γ-rays from two independent reactions arriving within the same 
coincidence window) are small, but would be accommodated by a larger value of k_ave 
to remove the known background lines; the fact that k_ave ≈ 1 suggests that chance 
coincidences are negligible. A final source of background is the long-lived decay of 
the 21/2⁺ isomer, which appears at an effectively constant rate in the 
microsecond-scale coincidence window; however, since the 268-keV transition does not appear 
in the natural decay of the 6.85-h isomer, these random coincidences do not 
interfere with the results. 
Excitation probability and cross-section. For the background-subtracted spectra 
presented in Fig. 3, gates were placed on the Doppler-shifted 241-keV peak and 
on either the 1,442-keV or the 686-keV peaks in each ring of Gammasphere. We 
took particular care to avoid overlaps at certain angles between these gates and the 
1,478- and 685-keV peaks from stopped nuclei. The Doppler-corrected 2,475-keV 
peak in Fig. 3a does not have a simple Gaussian shape. Rather than fitting the 
peak by constraining it to a particular shape, we simply determined the area above 
the background by taking the total number of counts within the energy range of 
the peak and subtracting the average background derived from the surrounding 
region of the spectrum. We obtained an area of A_2475 = 1.54(3) × 10⁴ counts. In 
the spectrum of Fig. 3b, in which no Doppler correction was applied, the 268-keV 
peak is a narrow line from stopped nuclei that could be fitted (along with the three 
other peaks of the spectrum) with a Gaussian atop a flat background, yielding an 
area of A_268 = 5.1(16) × 10² counts. Although we did not measure the absolute 
detector efficiencies ε_γ, the relative efficiencies of Gammasphere at 2,475 keV and 
268 keV (see Methods section ‘Energy and efficiency calibrations’) are sufficient 
for determining the ratio R = ε_γ,2475/ε_γ,268 = 0.28. By factoring in the intensity of 
the unobserved internal-conversion branch, which is α = 0.0355 times the γ-ray 
intensity of the 268-keV E2 transition (this branch is negligible for the 2,475-keV 
transition; see ref. 3 and http://bricc.anu.edu.au), we calculated the excitation 
probability as P_exc = R A_268 (1 + α)/A_2475 = 1.0(3)% for ⁹³ᵐMo nuclei traversing 
the C target. 
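
The arithmetic behind the quoted excitation probability can be reproduced with a short sketch. The peak areas, efficiency ratio and conversion coefficient are the values given above; the error propagation is an assumption made here for illustration (only the two peak areas are taken to carry uncertainty, treated as independent), not necessarily the procedure used in the analysis.

```python
import math

# Values quoted in the text above.
A_2475, dA_2475 = 1.54e4, 0.03e4   # counts in the Doppler-corrected 2,475-keV peak
A_268, dA_268 = 5.1e2, 1.6e2       # counts in the 268-keV peak
R = 0.28                           # relative efficiency eps(2,475 keV) / eps(268 keV)
ALPHA = 0.0355                     # internal-conversion coefficient, 268-keV E2 transition


def excitation_probability(a_268, a_2475, eff_ratio, conv_coeff):
    """P_exc = R * A_268 * (1 + alpha) / A_2475."""
    return eff_ratio * a_268 * (1.0 + conv_coeff) / a_2475


p_exc = excitation_probability(A_268, A_2475, R, ALPHA)

# Assumption: independent uncertainties on the two areas only, so the
# relative uncertainties add in quadrature.
rel_unc = math.hypot(dA_268 / A_268, dA_2475 / A_2475)
dp_exc = p_exc * rel_unc

print(f"P_exc = {100 * p_exc:.1f}% +/- {100 * dp_exc:.1f}%")
```

Under these assumptions the uncertainty is dominated by the 268-keV peak area, whose relative uncertainty is about 31%.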

As a comparison, in ref. 13 a probability of 0.01% is calculated for NEEC 
in ⁹³Mo, but it is stated that this value could be much higher because certain 
parameters were not very well known. Although the experimental conditions 
proposed in that work for the observation of NEEC are similar to those used in 
our experiment, the excitation probabilities cannot be directly compared because 
the predicted value is dependent upon NEEC occurring in a specific length of 
helium gas¹³. Similarly, this sensitivity of the excitation probability to the 
experimental conditions prevents a reliable comparison of our measurement with other 
existing theoretical predictions. 

The excitation probability P_exc can be expressed in terms of a cross-section 
σ_exc = N_exc/(N_proj n), where N_exc and N_proj are the numbers of excitations and 
projectiles, respectively, and n is the number of target atoms per unit area. The ratio 
N_exc/N_proj here is equivalent to P_exc, while n can be written as n = ρN_A/A_t, where 
ρ is the target surface density, A_t is the mass per mole of the target and N_A is the 
Avogadro constant. The full thickness of C in the target had ρ = 5 mg cm⁻² and 
the average cross-section across this thickness would be 40 b. This implies a much 
larger peak cross-section for the resonance, but it is worth noting that rather large 
values have been suggested for other scenarios*. 
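
As a check on the conversion from excitation probability to average cross-section, the relation σ = P_exc/n with n = ρN_A/A_t can be evaluated directly. This is a minimal sketch: P_exc and ρ are taken from the text, while the molar mass of natural carbon (12.011 g/mol) is an assumption made here.

```python
AVOGADRO = 6.02214076e23   # atoms per mole (exact SI value)
BARN_CM2 = 1.0e-24         # one barn expressed in cm^2


def mean_cross_section_barn(p_exc, surface_density_g_cm2, molar_mass_g_mol):
    """sigma = P_exc / n, with n = rho * N_A / A_t target atoms per cm^2."""
    n_atoms_per_cm2 = surface_density_g_cm2 * AVOGADRO / molar_mass_g_mol
    return p_exc / n_atoms_per_cm2 / BARN_CM2


# P_exc and rho come from the text; the molar mass is an assumed value.
sigma_b = mean_cross_section_barn(p_exc=0.010,
                                  surface_density_g_cm2=5.0e-3,
                                  molar_mass_g_mol=12.011)
print(f"average cross-section ~ {sigma_b:.0f} b")
```

With these inputs the areal density is about 2.5 × 10²⁰ atoms per cm², which gives a value consistent with the 40 b quoted above.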

Alternative reactions and excitation mechanisms. The same reaction that was 
examined in this work was investigated at ANU with the beam and target nuclei 
interchanged and at a similar centre-of-mass energy (C.J.C. et al., manuscript in 
preparation). Both reactions are expected to populate the same set of excited states. 
However, in normal kinematics, with a light beam (⁷Li) on a heavy target (⁹⁰Zr), 
the kinetic energy of the resulting ⁹³Mo recoil would be too low for NEEC to 
occur. The analysis of that dataset demonstrated that under those conditions the 
2,475-keV γ-ray was not coincident with either the 268-keV or the 1,478-keV 
transitions in ⁹³Mo or in any other nuclide produced in the ⁷Li + ⁹⁰Zr reaction. 
Similarly, reactions between the ⁹⁰Zr projectiles and the C in our layered target were 
ruled out as the origin of the observed coincidences by examining the C + ⁹⁰Zr 
reaction (C.J.C. et al., manuscript in preparation). Thus, the observed γ-ray 
sequence 2,475 keV-268 keV-685 keV-1,478 keV did not originate from contaminant 
fusion-evaporation reactions involving either Li or C. (As noted earlier, 
reactions between ⁹⁰Zr and the Pb stopper are expected to yield only a couple 
of γ-rays with well-known energies following Coulomb excitation.) Although 
these measurements were performed at a different facility from the NEEC experiment 
at ATLAS, the detector array at ANU is sufficiently sensitive to detect such 
coincidences had they existed in normal kinematics, given the excitation 
probability deduced from the ATLAS experiment. 

Coulomb excitation and inelastic scattering of ⁹³ᵐMo are possible alternative 
mechanisms by which the nucleus could be excited from the isomer to the intermediate 
state (as was observed for a Cu isomer in ref. 12). We used the codes GOSIA and 
RACHEL to calculate the expected yield of 268-keV γ-rays from the intermediate 
state as a fraction of the total number of ⁹³ᵐMo recoils incident on ²⁰⁸Pb. GOSIA 
requires electromagnetic transition strengths as input. Although few transition 
strengths in ⁹³Mo have been experimentally determined, several could be estimated 
from the analogous transitions in ⁹²Mo owing to the close structural similarity 
of these two isotopes. For the 4.85(9)-keV, 21/2⁺ → (17/2)⁺ E2 excitation, 
a reduced transition strength of B(E2) = 72 e² fm⁴, where e is the charge of the 
electron, can be obtained from the shell-model estimate of ref. 33, which is about 
two times larger than the strength B(E2; 8⁺ → 6⁺) of the analogous transition 
measured in ⁹²Mo. Weaker (or unobserved) transitions between other pairs of 
states were assumed to have smaller transition rates than those noted above. With 
these assumptions, we determine the probability of exciting the isomer to the 
intermediate state via Coulomb excitation in ²⁰⁸Pb to be approximately 3 × 10⁻⁶. 
This probability is dominated by the direct transition between the two states, with 
minimal impact from multistep excitations involving other states, and repeating 
the calculation with a drastically simplified level scheme (containing only these two 
states plus those below the isomer) yields comparable results. Additionally, even 
by increasing the strength B(E2; 21/2⁺ → (17/2)⁺) by 100 times, which exceeds 
the recommended upper limit, the resulting probability of 2 × 10⁻⁴ is still much 
lower than the experimental P_exc = 0.010(3). 

We also used FRESCO to calculate the expected cross-sections for the excitation 
of ⁹³ᵐMo to the intermediate state via above-barrier inelastic scattering 
in Li and in C. Using the GOSIA findings as a starting point, a simplified level 
scheme with only direct excitation of the intermediate state was assumed to be 
sufficient. The reduced transition strength of 72 e² fm⁴ was again used. Extended 
Data Fig. 4 shows the cross-sections as a function of ⁹³Mo recoil energy in both 
target materials, as calculated by FRESCO. Taking the average cross-section in 
each medium separately, the corresponding probabilities for the 21/2⁺ → (17/2)⁺ 
excitation are 6 x 107° in Li, assuming °3Mo production at the centre of the target, 
and 2 x 10° over the full thickness of C. 

We note that GOSIA and FRESCO are well-established reaction codes validated 
by substantial experimental data. While various NEEC models have existed for 
several decades, there have been no prior non-null experimental results to test 
their reliability. 

Energy and efficiency calibrations. We performed both energy and relative-efficiency 
calibrations of the Gammasphere Ge detectors using ¹⁵²Eu and ⁵⁶Co 
calibration sources. Each source was positioned in the target location, at the 
centre of Gammasphere. The energies and intensities of the strongest γ-rays that 
are emitted after the decay of these nuclides, which span from 122 keV to 1,408 keV 
(¹⁵²Eu) and 847 keV to 3,451 keV (⁵⁶Co), are well known. We fitted the peak 
centroids and areas and compared them with these known energies and intensities, 
respectively, using the RadWare suite of analysis codes 
(http://radware.phy.ornl.gov). The data from both sources were combined for the 
energy calibration and were simultaneously fitted with a second-order polynomial. 
The detection efficiencies of the Ge detectors had a more complex energy dependence 
and the two sources were treated separately, as their absolute activities were not 
precisely known. We performed a preliminary fit of the ¹⁵²Eu lines and then 
normalized the ⁵⁶Co relative efficiencies to those of ¹⁵²Eu in the region where the 
energies for the two sources overlap to obtain an efficiency curve for the full 
0.1-3.4-MeV energy range. This procedure provided relative efficiencies, but these 
are sufficient because our analyses use only efficiency ratios. 
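
The second-order polynomial energy calibration described above can be illustrated with a small least-squares sketch. This is not the RadWare procedure: the function `fit_quadratic`, the channel values and the "true" coefficients are all hypothetical, chosen only to show that a quadratic fitted to the combined peak centroids recovers the calibration.

```python
def fit_quadratic(x, y):
    """Least-squares fit y = a0 + a1*x + a2*x**2 via the 3x3 normal equations."""
    s = [sum(xi ** k for xi in x) for k in range(5)]                   # sums of x^k
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]  # sums of y*x^k
    m = [[s[i + j] for j in range(3)] for i in range(3)]

    def det3(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det3(m)
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column of the normal matrix by the moment vector.
        replaced = [[t[i] if j == col else m[i][j] for j in range(3)] for i in range(3)]
        coeffs.append(det3(replaced) / d)
    return coeffs


# Synthetic peak centroids (channel / 1000) and a hypothetical calibration standing
# in for the combined Eu and Co lines; none of these numbers come from the paper.
true_coeffs = (0.8, 333.0, 0.15)
channels = [0.36, 1.03, 2.33, 2.89, 4.22, 2.54, 3.71, 7.78, 9.72, 10.31]
energies = [true_coeffs[0] + true_coeffs[1] * c + true_coeffs[2] * c * c
            for c in channels]

a0, a1, a2 = fit_quadratic(channels, energies)
print(f"E(keV) = {a0:.3f} + {a1:.3f}*c + {a2:.3f}*c^2, with c = channel/1000")
```

Channels are rescaled by 1,000 in this sketch to keep the normal equations well conditioned; a production fit would also weight each centroid by its uncertainty.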

Uncertainties. The uncertainty on the counts in each energy bin of a spectrum 
double-gated on a peak or the background was taken to be the square root of 
the number of counts in that energy bin. For a background-subtracted spectrum 
formed by a linear combination of the component spectra (equations (1) or (2)), 
the uncertainty was that of the individual components added in quadrature. These 
uncertainties were used as statistical weights for each energy bin in the fits to the 
peak areas. 
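
The per-bin treatment described above can be sketched as follows. The coefficients and counts are hypothetical, and the assumption made here is that each component spectrum enters the linear combination with a coefficient c_i, so that its Poisson uncertainty √N_i is scaled by |c_i| before the components are added in quadrature.

```python
import math


def combine_spectra(components, coefficients):
    """Per-bin value and uncertainty of sum_i c_i * N_i with sigma_i = sqrt(N_i)."""
    n_bins = len(components[0])
    values, sigmas = [], []
    for b in range(n_bins):
        value = sum(c * spec[b] for c, spec in zip(coefficients, components))
        # Quadrature sum: variance of c_i * N_i is c_i^2 * N_i for Poisson counts.
        variance = sum(c * c * spec[b] for c, spec in zip(coefficients, components))
        values.append(value)
        sigmas.append(math.sqrt(variance))
    return values, sigmas


# Hypothetical two-bin spectra: a peak-gated spectrum and a background-gated one.
peak_gated = [100, 400]
background_gated = [36, 64]
values, sigmas = combine_spectra([peak_gated, background_gated], [1.0, -1.0])
print(values, [round(s, 2) for s in sigmas])
```

The resulting `sigmas` are what would be used as the statistical weights of the bins in a subsequent peak fit.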

The systematic errors associated with the spectra in Fig. 2a, c were taken to 
be the uncertainties in k_ave, as determined from the known background γ-rays 
in each spectrum. We could then define the total uncertainty σ as the statistical 
and systematic errors added in quadrature. For example, the weighted average 
of k = 1.33(7) and 1.21(7) for the 268- and 685-keV peaks in the double gate on 
the 2,475- and 1,478-keV peaks (Fig. 2a) is k_NEEC = 1.27 with statistical 
uncertainty 0.05, while the systematic uncertainty on k_ave = 0.99 is 0.08. The total 
uncertainty is σ = (0.05² + 0.08²)^(1/2) = 0.09, and thus the peaks from NEEC are 
k_NEEC − k_ave = 0.28 ≈ 3σ above background (compared to a level of about 7σ from 
a purely statistical uncertainty). A similar analysis for the double gate on the 2,475- 
and 268-keV peaks (Fig. 2c) also produces a 3σ result. 
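
The worked example above (weighted average, quadrature sum and resulting significance) can be reproduced with a few lines. The input values are those quoted in the text; the inverse-variance weighting in `weighted_mean` is the standard choice and is assumed here.

```python
import math


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its statistical uncertainty."""
    weights = [1.0 / s ** 2 for s in sigmas]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))


# k for the 268- and 685-keV peaks (Fig. 2a), as quoted in the text.
k_neec, stat = weighted_mean([1.33, 1.21], [0.07, 0.07])
k_ave, syst = 0.99, 0.08          # background ratio and its uncertainty

total = math.hypot(stat, syst)    # statistical and systematic added in quadrature
excess_sigma = (k_neec - k_ave) / total

print(f"k_NEEC = {k_neec:.2f} +/- {stat:.2f} (stat)")
print(f"total uncertainty = {total:.2f}, excess = {excess_sigma:.1f} sigma")
```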

Code and data availability. The 7-TB dataset generated and analysed during this 
study is available from the corresponding author on reasonable request. The codes 
used in the analysis, aside from those cited in the references, are also available. 


28. Tarasov, O. B. & Bazin, D. Development of the program LISE: application to 
fusion-evaporation. Nucl. Instrum. Methods B 204, 174-178 (2003). 

29. McCutchan, E. A., Lister, C. J. & Greene, J. P. A target vacuum interlock system 
for Gammasphere. Nucl. Instrum. Methods A 607, 564-567 (2009). 

30. Radford, D. C. Background subtraction from in-beam HPGe coincidence data 
sets. Nucl. Instrum. Methods A 361, 306-316 (1995). 

31. Pálffy, A., Harman, Z. & Scheid, W. Quantum interference between nuclear 
excitation by electron capture and radiative recombination. Phys. Rev. A 75, 
012709 (2007). 

32. Fukuchi, T. et al. High-spin isomer in ⁹³Mo. Eur. Phys. J. A 24, 249-257 (2005). 

33. Hasegawa, M., Sun, Y., Tazaki, S., Kaneko, K. & Mizusaki, T. Characteristics of the 
21/2⁺ isomer in ⁹³Mo: toward the possibility of enhanced nuclear isomer 
decay. Phys. Lett. B 696, 197-200 (2011). 

34. Baglin, C. M. Nuclear data sheets for A = 92. Nucl. Data Sheets 113, 
2187-2389 (2012). 

35. Firestone, R. B. et al. (eds) Table of Isotopes 8th edn, Vol. II (John Wiley & Sons, 
1996). 

36. Radford, D. C. ESCL8R and LEVIT8R: software for interactive graphical analysis 
of HPGe coincidence data sets. Nucl. Instrum. Methods A 361, 297-305 (1995). 


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


[Extended Data Figure 1a, target schematic. Labels: C, 0.50 mg/cm²; gap of about 3 mm; 33 mg/cm²; 4.2 mg/cm²; 1.55 mg/cm²] 


Extended Data Figure 1 | Configuration of the target used in the 
experiment. a, Schematic of the target construction showing the layers in 
which ⁹³Mo production occurs (Li), NEEC can occur (C) and the backing 
that stops all recoils (²⁰⁸Pb), in addition to the important gap of about 
3 mm that is needed to accommodate the effective half-life for the decay 
of the 4,900-keV level. Relative dimensions are not to scale. The beam is 
incident on the Li surface. b, Photograph of the target positioned inside 
the Gammasphere target chamber. The beam enters from the lower right 
side and is parallel to the double rods shown in the upper left part of the 
photograph. 

Extended Data Figure 2 | Spectra showing the line shape of the 
2,475-keV transition in ⁹³Mo. a, b, The spectra are from the detectors 
in ring 9 of Gammasphere at 90° (a) and ring 7 at 79° (b). The spectra in 
red were recorded while using a Li target backed with ²⁰⁸Pb and with no 
gap in between. The blue spectra were obtained with a modified target 
configuration with a gap of about 3 mm. The 2,361-keV peak corresponds 
to a transition in **Mo that lies below a level with a half-life of t₁/₂ = 35 ps. 
The similar line shapes of these two transitions support the estimate of a 
delay of tens of picoseconds in the 2,475-keV emission and therefore the 
rationale for the final target construction. 

Extended Data Figure 3 | Spectra used to determine background 
contributions. Component spectra for the double gate on the 
Doppler-shifted 2,475-keV γ-ray (1) and the unshifted 1,478-keV γ-ray (2), where 
gates on the peak and background regions are denoted as ‘g’ and ‘b’, 
respectively (see text). a, g₁g₂. b, g₁b₂. c, b₁g₂. d, b₁b₂. Only those γ-rays in 
⁹³Mo relevant to the discussion are labelled, with the dashed lines marking 
their energies. We note that the 770-keV peak in a is a multiplet with the 
773- and 777-keV transitions in ?*Mo and ?’Ru, respectively; only the last 
two peaks appear in b and d. 

Extended Data Figure 4 | Calculations of possible competing processes. 
The inelastic-scattering cross-sections for exciting ⁹³ᵐMo to the 
intermediate state, calculated with the code FRESCO, are plotted versus 
the energy of recoiling Mo ions traversing the ⁷Li (blue) and ¹²C (red) 
target layers. The initial energy is the average recoil energy corresponding 
to ⁹³Mo production at the centre of the Li target. 

Experimental observation of Bethe strings 


Zhe Wang, Jianda Wu, Wang Yang, Anup Kumar Bera, Dmytro Kamenskyi, A. T. M. Nazmul Islam, Shenglong Xu, 
Joseph Matthew Law, Bella Lake, Congjun Wu & Alois Loidl 


Almost a century ago, string states (complex bound states of 
magnetic excitations) were predicted to exist in one-dimensional 
quantum magnets¹. However, despite many theoretical studies, 
the experimental realization and identification of string states in 
a condensed-matter system have yet to be achieved. Here we use 
high-resolution terahertz spectroscopy to resolve string states in 
the antiferromagnetic Heisenberg-Ising chain SrCo₂V₂O₈ in strong 
longitudinal magnetic fields. In the field-induced quantum-critical 
regime, we identify strings and fractional magnetic excitations 
that are accurately described by the Bethe ansatz. Close to 
quantum criticality, the string excitations govern the quantum spin 
dynamics, whereas the fractional excitations, which are dominant 
at low energies, reflect the antiferromagnetic quantum fluctuations. 
Today, Bethe’s result¹ is important not only in the field of quantum 
magnetism but also more broadly, including in the study of cold 
atoms and in string theory; hence, we anticipate that our work will 
shed light on the study of complex many-body systems in general. 

Magnons are elementary quasiparticle excitations above the ground 
state in ferromagnets, which govern the low-temperature 
thermodynamics. For excited states with two or more magnons, a description 
in terms of free quasiparticles is very incomplete, especially in one and 
two dimensions, because the magnons can form bound states that share 
centre-of-mass momenta owing to the exchange interactions. For a 
one-dimensional system, bound states of magnons can be viewed as 
magnetic solitons in the classical limit, which correspond to strings 
of flipped spins that exist as bound entities in the chain. Studying 
the dynamical properties of interacting magnetic excitations is of 
interest not only because of the potential applications in quantum 
information, but also because it could provide insight into fundamental 
aspects of quantum magnetism and quantum many-body systems. 
In the one-dimensional spin-1/2 Heisenberg model (a paradigmatic 
model of interacting spin systems), the existence of bound states was 
first predicted in the early 1930s by Bethe for two magnons¹. The 
systematic ansatz introduced by Bethe for calculating the eigenvalues 
and eigenstates of the Heisenberg model exactly was later generalized 
to the description of multi-magnon bound states (the so-called Bethe 
strings) in models beyond the isotropic limit. It is generally 
believed that spin dynamics is governed by low-energy multi-particle 
excitations; however, the excitations of two-magnon bound 
states (two-string states) have recently been theoretically suggested to 
be dominant in the isotropic Heisenberg antiferromagnet. 
Nevertheless, the string excitations are sensitive to exchange anisotropy: 
for an easy-plane anisotropy, although the spin excitations remain 
gapless, the dynamical response of string states is substantially smaller 
compared with the fractional multi-particle excitations (spinons). 
In addition, in a spin-gapped Heisenberg-Ising antiferromagnet with 
easy-axis anisotropy, the dynamical properties are dominated by the 
fractional spinon excitations. Hence, experimentally realizing string 
states is very difficult, and has yet to be achieved in condensed-matter 
systems. Here, we perform terahertz spectroscopy on the 
one-dimensional Heisenberg-Ising antiferromagnetic system SrCo₂V₂O₈ in 
longitudinal magnetic fields. We show that when the spin gap is closed 
above a field-induced quantum phase transition at B_c = 4 T, many-body 
two-string and three-string states are identified in the quantum-critical 
regime before a fully field-polarized state is reached at B_s = 28.7 T. 
On decreasing the magnetic field from B_s, the dominant role of the 
low-energy fractional multi-particles in the dynamical response is 
gradually taken over by the string states, which govern the quantum 
spin dynamics close to the quantum phase transition. 

Figure 1 | Quantum spin chain in SrCo₂V₂O₈, psinon-(anti)psinon 
pairs and strings. Psinon-(anti)psinon pairs and strings are the 
characteristic magnetic excitations in one dimension in the critical regime. 
a, Chain structure of SrCo₂V₂O₈ with a four-fold screw axis along the 
c direction. At 1.7 K, Néel-ordered (S^z_T = 0) and field-polarized (S^z_T = N/2) 
states are stabilized for longitudinal magnetic fields B < B_c = 4 T and 
B > B_s = 28.7 T, respectively, where the spins are represented by the arrows. 
b, A representative configuration of the ground state in the critical regime 
(B_c < B < B_s) for total spin-z quantum number S^z_T = N/2 − r with r flipped 
spins with respect to the fully polarized state. c-f, Excitations above the 
ground state are allowed by the selection rules ΔS^z_T = +1, yielding 
psinon-psinon pairs (c), and ΔS^z_T = −1, yielding psinon-antipsinon pairs (d), 
two-string states (e) or three-string states (f), which govern the interaction 
with the magnetic field of a photon. Whereas the psinon-(anti)psinon 
pairs can propagate throughout the chain without forming bound states, 
the two-string and three-string states (bound states formed by two and 
three magnons, respectively; circled) move as entities in the chain. The 
flipped spin with respect to the ground state (b) is indicated by a wiggly 
line for each excited state (c-f). 

Figure 2 | Softening of spinons and emergent magnetic excitations at 
the quantum phase transition in SrCo₂V₂O₈. a, Transmission spectra of 
magnetic excitations in SrCo₂V₂O₈ for various frequencies below 1 THz, 
measured with the applied longitudinal magnetic field B ∥ c and the 
electromagnetic wave propagating along the c axis. Magnetic resonance 
excitations corresponding to transmission minima are indicated by arrows. 
b, Eigenfrequencies of the resonance modes as a function of the applied 
magnetic field. For B < B_c, a series of confined fractional spinon 
excitations split in low fields (1±, 2±, ...) and follow a linear field 
dependence (dashed lines). Above B_c, new modes emerge (R₀, χ^(2)_π and 
χ^(3)_π/2) with completely different field dependencies. Deviations from the 
linear field dependence appear when approaching the critical field. Error 
bars indicate the resonance line widths. In both panels, the field-induced 
phase transition from the Néel-ordered phase to the critical regime is 
indicated by the vertical dashed line at the critical field B_c = 4 T. 

The realization of the one-dimensional Heisenberg-Ising model 
in SrCo₂V₂O₈ is based on its crystal structure and the dominant 
nearest-neighbour antiferromagnetic interactions (Fig. 1a). The screw 
chains of CoO₆ octahedra, with the four-fold screw axis running along 
the crystallographic c direction, are arranged in a tetragonal structure. 
Owing to spin-orbit coupling, the atomic magnetic moments of the 
Co²⁺ ions, comprising the spin and orbital degrees of freedom, are 
exposed to an Ising anisotropy. The crystal electric field in the CoO₆ 
octahedra lifts the twelve-fold degeneracy of the Co²⁺ moments, which 
results in a Kramers doublet ground state with a total angular 
momentum of 1/2 (ref. 21). Magnetization and neutron diffraction experiments 
reveal that the Ising anisotropy forces the atomic magnetic moments 
along the c axis, with Néel-type collinear antiferromagnetic order 
stabilized below T_N = 5 K (ref. 22). 

Superexchange interactions between the magnetic moments in the 
chains of SrCo₂V₂O₈ are described by the Hamiltonian of the 
one-dimensional spin-1/2 Heisenberg-Ising model: 

$$H = J \sum_{n=1}^{N} \left[ \left( S_n^x S_{n+1}^x + S_n^y S_{n+1}^y \right) + \Delta S_n^z S_{n+1}^z \right] - g_\parallel \mu_B B \sum_{n=1}^{N} S_n^z$$

where J > 0 is the antiferromagnetic coupling between neighbouring 
spins, Δ > 1 accounts for the Ising anisotropy between the longitudinal 
and transverse spin couplings, S^x_n, S^y_n and S^z_n are the spin components 
at the nth site and N is the length of the chain. The last term corresponds to the 
Zeeman interaction in a longitudinal magnetic field B along the c axis 
with g-factor g_∥ and Bohr magneton μ_B. 

The Néel ground state at zero field can be illustrated by an 
antiparallel alignment of neighbouring spins, corresponding to a total 
spin-z quantum number of S^z_T = 0 (Fig. 1a). One spin flip creates an 
excitation of two spinons, which can propagate separately along the 
chain via subsequent spin flips. In momentum space the spinons form 
a two-particle excitation continuum that is gapped above the 
antiferromagnetic ground state. 
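
To make the structure of the Heisenberg-Ising Hamiltonian above concrete, the following sketch applies it to a single basis state of a short periodic chain. It is an illustration only, not the Bethe-ansatz method used in the paper: the function names, the bit-mask representation, the four-site chain, the periodic boundary condition and the value used for g_∥μ_B B are assumptions made here, while J = 3.55 meV and Δ = 2 are the values quoted in the caption of Fig. 4.

```python
def apply_hamiltonian(state, n_sites, j, delta, zeeman):
    """Apply H to one S^z basis state of a periodic spin-1/2 chain.

    `state` is a bit mask (bit n set means spin n points up) and `zeeman`
    stands for the product g * mu_B * B. Returns {basis_state: amplitude}.
    """
    out = {}
    diagonal = 0.0
    for n in range(n_sites):
        m = (n + 1) % n_sites                 # periodic neighbour (assumption)
        up_n = (state >> n) & 1
        up_m = (state >> m) & 1
        sz_n, sz_m = up_n - 0.5, up_m - 0.5
        diagonal += j * delta * sz_n * sz_m   # Ising term J * Delta * S^z_n S^z_{n+1}
        diagonal -= zeeman * sz_n             # Zeeman term -g mu_B B S^z_n
        if up_n != up_m:
            # S^x S^x + S^y S^y = (S^+ S^- + S^- S^+)/2 swaps antiparallel neighbours.
            swapped = state ^ ((1 << n) | (1 << m))
            out[swapped] = out.get(swapped, 0.0) + 0.5 * j
    out[state] = out.get(state, 0.0) + diagonal
    return out


def total_sz(state, n_sites):
    """Total spin-z quantum number of a basis state."""
    return bin(state).count("1") - n_sites / 2


J_MEV, DELTA = 3.55, 2.0                      # values quoted in the Fig. 4 caption
neel = apply_hamiltonian(0b0101, 4, J_MEV, DELTA, zeeman=0.0)
polarized = apply_hamiltonian(0b1111, 4, J_MEV, DELTA, zeeman=1.0)

print("Neel state couples to:", sorted(neel))
print("S^z_T of coupled states:", {total_sz(s, 4) for s in neel})
print("polarized state:", polarized)
```

Every state reached from the Néel configuration has the same total S^z, which is the conservation law that allows the eigenstates to be classified by this quantum number. The fully polarized state is an eigenstate, since no antiparallel neighbours are available to swap.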

As a result of the Zeeman interaction, a longitudinal magnetic field 
can reduce and finally close the spin gap at a critical field B_c (ref. 2). 
Before reaching the fully polarized state (B > B_s; Fig. 1a) in which the 
elementary excitations are gapped magnons, the system enters a gapless 
phase that corresponds to the critical regime (B_c < B < B_s). A general 
ground state in this regime with an arbitrary value of S^z_T is illustrated 
in Fig. 1b, and fundamentally new and exotic states can be excited by 
flipping a single spin (Fig. 1c-f). According to the Bethe ansatz, the 
spin excitations in the critical regime can be bound states of n magnons 
(n-string states) or low-energy spinon-like quasiparticles. The 
spinon-like quasiparticles, which form multi-particle continua 
similarly to spinons (see Methods), are named psinons or antipsinons 
in the context of the Bethe ansatz to distinguish them from the spinons 
in zero field. We adopt this nomenclature in the following for this 
reason and, more importantly, because our results reveal that the 
excitations of psinon-psinon and psinon-antipsinon pairs obey different 
selection rules (Fig. 1c, d). 

Because excitations with ΔS^z_T = +1 and ΔS^z_T = −1 are both allowed 
by the selection rules that govern the interaction with the magnetic field 
of photons or with the magnetic moment of neutrons, the excited states 
illustrated in Fig. 1c-f should be observable in optical or 
neutron-scattering experiments. We use terahertz optical spectroscopy on the 
Heisenberg-Ising antiferromagnetic chain SrCo₂V₂O₈ in a longitudinal 
magnetic field up to 30 T. Our results provide clear experimental evidence 
for the existence of the many-body two-string and three-string states and 
of the fractional multi-particle excitations that characterize the quantum 
spin dynamics of the one-dimensional spin-1/2 Heisenberg-Ising model. 

In a longitudinal magnetic field, S^z_T is a good quantum number. The 
eigenstates of the Heisenberg-Ising model can therefore be classified 
accordingly, and described by a general Bethe-ansatz wavefunction 

$$|\psi\rangle = \sum_{1 \le n_1 < \cdots < n_r \le N} a(n_1, n_2, \ldots, n_r)\, |n_1, n_2, \ldots, n_r\rangle$$

for a total spin-z quantum number S^z_T = N/2 − r (Fig. 1b), with N 
denoting the length of the chain and with r flipped spins (at sites 
n₁, n₂, ..., n_r in the chain) with respect to the fully spin-polarized state 
|⋯→→→⋯⟩, that is, 

$$|n_1, n_2, \ldots, n_r\rangle = S_{n_1}^{-} S_{n_2}^{-} \cdots S_{n_r}^{-}\, |\cdots\rightarrow\rightarrow\rightarrow\cdots\rangle$$

where S^±_{n_j} = S^x_{n_j} ± iS^y_{n_j} are the operators that flip the spin of site n_j. We 
use the Bethe ansatz to obtain the coefficients a(n₁, n₂, ..., n_r) and the 
eigenenergies of the ground state and the excited states for every S^z_T (see 
Methods). The excited states of psinon-(anti)psinon pairs and of 
n-strings correspond to the real and complex momenta in the solutions 
of the Bethe-ansatz equations and are labelled as R_q and χ^(n)_q, 
respectively, with the subscript indexing the corresponding transfer 
momenta. The excitations that are allowed in optical experiments obey 
the selection rules ΔS^z_T = +1 or ΔS^z_T = −1. These excitations contribute 
to the dynamic structure factors S⁻⁺(q, ω) or S⁺⁻(q, ω), respectively, 
defined by 

$$S^{a\bar{a}}(q, \omega) = \pi \sum_{\mu} \left| \langle \mu | S_q^{\bar{a}} | G \rangle \right|^2 \delta(\omega - E_\mu + E_G)$$

in which ā = −a with a ∈ {+, −}, |G⟩ and |μ⟩ are the ground and 
excited states, with eigenenergies of E_G and E_μ, respectively, and 

$$S_q^{\pm} = \frac{1}{\sqrt{N}} \sum_{n} e^{iqn} S_n^{\pm}.$$

Hence, we can quantitatively attribute the contributions of string 
excitations and psinon-(anti)psinon pairs to the relevant dynamic structure 
factors. The string excitations with higher energies and characteristic 
field dependencies can readily be distinguished from the low-energy 
psinon-(anti)psinon pairs (see Methods). This enables us to compare 
the theoretical calculations to the experimental results precisely as a 
function of the longitudinal field and to identify the nature of each 
observed mode. 

Figure 3 | Absorption spectra of psinon-psinon, psinon-antipsinon, 
two-string and three-string excitations for B_c < B < B_s and of magnons 
for B > B_s in SrCo₂V₂O₈. a, b, Absorption spectra for various longitudinal 
magnetic fields in the critical regime in the low-energy (a) and 
high-energy (b) spectral ranges. a, Four types of excitation (R₀, R_π/2, χ^(2)_π and 
χ^(3)_π/2, each with a characteristic field dependence) are observed in the 
critical regime (B_c < B < B_s). Mode R₀ evolves from mode M₀ above B_s. 
Whereas mode R_π/2 softens with increasing fields, the eigenenergies of the 
other modes increase. b, A higher-energy mode χ^(2)_0 can be resolved at 
relatively low magnetic fields. The spectra are shifted upwards 
proportional to the corresponding magnetic fields. 

Figure 4 | Magnetic excitations in the longitudinal-field 
Heisenberg-Ising chain SrCo₂V₂O₈. Eigenfrequencies are shown as a function of 
longitudinal magnetic field for all magnetic excitations observed 
experimentally (symbols). Below B_c = 4 T, confined spinons are observed 
in the Néel-ordered phase. In the critical regime (B_c < B < B_s), excitations 
of psinon-psinon pairs (R_π/2 at q = π/2), psinon-antipsinon pairs (R₀ at 
q = 0), and of complex many-body two-string (χ^(2)_0 and χ^(2)_π at q = 0 and 
q = π, respectively) and three-string (χ^(3)_π/2 at q = π/2) bound states are 
identified by the field dependencies of their eigenfrequencies. Above 
B_s = 28.7 T, magnons (M₀ at q = 0) are observed in the field-polarized 
ferromagnetic phase. Solid lines display the results of the dynamic 
structure factors S⁻⁺(q, ω) and S⁺⁻(q, ω) at q = 0, q = π/2 and q = π 
for the corresponding excitations of the one-dimensional spin-1/2 
antiferromagnetic Heisenberg-Ising model. For all five modes (R₀, R_π/2, 
χ^(2)_0, χ^(2)_π and χ^(3)_π/2), excellent agreement between theory and experiment is 
achieved using an exchange interaction J = 3.55 meV, g-factor g_∥ = 6.2 and 
Ising anisotropy Δ = 2 (ref. 24). Experimental and theoretical line widths 
are indicated by error bars and shading, respectively. 

In Fig. 2a we show transmission spectra at various frequencies below 
1 THz as a function of longitudinal magnetic field. At 0.195 THz, two 
transmission minima are observed, at 2.41 T (mode 1−) and 6.18 T 
(mode R₀), below and above the critical field B_c = 4 T. With increasing 
magnetic fields, mode 1− shifts to lower frequencies and mode R₀ to 
higher frequencies. Mode 1− together with modes 1+, 2−, 2+ and 3−, 
which are observed at higher frequencies (0.39 THz and 0.59 THz), are 
known as confined spinon excitations owing to the inter-chain 
couplings in the gapped Néel-ordered phase (B < B_c). Well below the 
critical field B_c (Fig. 2b) the confined spinons exhibit Zeeman splitting 
with linear field dependence. Close to the critical field mode 
1− softens, concomitant with a substantial hardening of mode 1+. This 
indicates that the inter-chain couplings are suppressed above B_c and 
that the system enters the field-induced critical regime in one 
dimension. Completely different excitation spectra appear in the 
critical regime: in the same frequency range (Fig. 2a), we 
unambiguously observe three sharp modes, denoted by R₀, χ^(2)_π and χ^(3)_π/2. With 
increasing magnetic field well above the critical field B_c, the 
eigenenergies of the three modes increase linearly with different slopes 
(Fig. 2b). 

Using magneto-optic spectroscopy in a high-field laboratory, we are 
able to extend the search for magnetic excitations to a much larger 
spectral range and to higher magnetic fields up to 30 T, covering the 
complete critical regime (B_c < B < B_s) and the field-polarized 
ferromagnetic phase (B > B_s = 28.7 T). As displayed in Fig. 3, the magnetic 
excitations are represented by the peaks in the absorption-coefficient 
spectra at various magnetic fields. At 10 T, we identify not only modes 
R₀, χ^(2)_π and χ^(3)_π/2 at 1.66 meV, 3.67 meV and 4.23 meV, respectively, in a 
sequence of increasing energies, but also a higher-energy mode R_π/2 at 
5.82 meV. Whereas R_π/2, R₀ and χ^(3)_π/2 have comparable absorption 
coefficients, mode χ^(2)_π is much weaker, which is consistent with the 
low-field measurements (such as the 0.59-THz spectrum in Fig. 2a). In 
higher magnetic fields, mode R_π/2 softens, whereas the other modes 
(R₀, χ^(2)_π and χ^(3)_π/2) shift to higher energies. Above the low-energy 
phonon bands (see Methods), we resolve a further high-energy 
magnetic excitation χ^(2)_0, of 14.6 meV at 13 T, which shifts to higher 
energies with increasing magnetic field (Fig. 3b). The field dependence 
of the eigenfrequencies of the five observed modes (R₀, R_π/2, χ^(2)_0, χ^(2)_π 
and χ^(3)_π/2) is summarized in Fig. 4 (symbols). The field dependencies 
are linear for all of the modes, each with distinct slope and 
characteristic energy. 

From the Bethe-ansatz calculations, we can single out various 
excitations and evaluate their respective contributions to the dynamic 
structure factors S⁻⁺(q, ω) and S⁺⁻(q, ω) (Methods). In accord with 
Brillouin-zone folding due to the four-fold screw-axis symmetry of the 
spin chain, in Fig. 4 we show the peak frequencies in S⁻⁺(q, ω) and 
S⁺⁻(q, ω) as a function of magnetic field (solid lines) for the transfer 
momenta q = 0, q = π/2 and q = π, so as to compare to the terahertz 
spectroscopic results. Excellent agreement between theory and 
experiment is achieved for all five distinct magnetic excitations, which enables 
us to unambiguously identify their nature: R₀ characterizes 
psinon-antipsinon pairs at q = 0, whereas R_π/2 characterizes psinon-psinon 
pairs at q = π/2. The psinon-antipsinon excitations R₀, related to the 
single spin-flip operator S⁻ at q = 0, evolve from the magnon mode M₀ in the 
field-polarized ferromagnetic phase, in which the largest absorption is 
observed experimentally (Figs 3a, 4). Most strikingly, we are able to 
detect and identify the two-string states χ^(2)_0 and χ^(2)_π at q = 0 and q = π, 
respectively, and the three-string states χ^(3)_π/2 for q = π/2. 

The branch of psinon-psinon pairs R_{π/2} belongs to S^{-+}(q, ω) and
obeys the selection rule ΔS^z_T = +1. Hence, the psinon-psinon excitations
correspond to flipping one spin into the direction of the magnetic
field (Fig. 1c), which will decrease the Zeeman energy, so that mode
R_{π/2} softens with increasing field. By contrast, the psinon-antipsinon
pairs and the string states, which correspond to S^{+-}(q, ω), obey the
selection rule ΔS^z_T = −1 (Fig. 1d-f) and so their eigenenergies increase
with magnetic field. The linear dependencies, which arise essentially
from the linear dependence of the Zeeman energy on magnetic field,
are substantially renormalized as a result of the one-dimensional many-body
interactions.
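
The sign argument above can be illustrated with a minimal sketch that evaluates the bare (unrenormalized) Zeeman slope of an excitation energy, −ΔS^z_T g μ_B, using the g-factor of 6.2 quoted in the Methods. The function name is hypothetical and the many-body renormalization is deliberately ignored, so the numbers are not expected to match the measured slopes in Fig. 4; only the signs carry over.

```python
# Bohr magneton in meV per tesla (CODATA value, rounded).
MU_B_MEV_PER_T = 0.0578838


def bare_zeeman_slope(delta_sz, g_factor=6.2):
    """Bare slope dE/dB in meV/T for an excitation that changes total S^z by delta_sz.

    The Zeeman term is -g * mu_B * B * S^z, so raising S^z lowers the energy.
    Interaction-induced renormalization of the slope is not included (assumption).
    """
    return -delta_sz * g_factor * MU_B_MEV_PER_T


if __name__ == "__main__":
    # Delta S^z = +1 (psinon-psinon branch): negative slope, the mode softens.
    print(bare_zeeman_slope(+1))
    # Delta S^z = -1 (psinon-antipsinon and string branches): positive slope.
    print(bare_zeeman_slope(-1))
```

The bare magnitude, about 0.36 meV per tesla, sets only the scale; the text attributes the differences between the observed slopes to the one-dimensional many-body interactions.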

The observation of three-string states reflects a very peculiar feature 
of the one-dimensional Heisenberg-Ising model: close to quantum 
criticality, even three magnons can form a stable bound state and, more 
surprisingly, the bound states of three magnons govern the dynamical 
response”®. This is in clear contrast to the isotropic Heisenberg model, 
in which the two-string states dominate"®, or to the models with easy- 
plane anisotropy, in which the fractional multi-particles essentially 
characterize the dynamical properties*’. Besides their eigenfrequen- 
cies, the contribution of the string states to the spin dynamics is also 
strongly field-dependent®'°”*. Starting from the quantum phase transition
of the Heisenberg-Ising model”, an increase in magnetic field
leads to a decreasing contribution of the string excitations. Above the 
half-saturated magnetization, the low-energy multi-particle excita- 
tions become dominant, finally governing the spin dynamics in the 
fully field-polarized limit (Methods). This is manifested by the rapidly 
increasing absorption of mode R_0 (see Fig. 3).

We have identified many-body two-string and three-string states in 
the quantum-critical regime of a one-dimensional spin-1/2 Heisenberg- 
Ising chain. This represents an example of the experimental realization 
of strongly correlated quantum states in condensed-matter systems”. 
Further dynamical properties of the string states are expected to be 
revealed from inelastic neutron-scattering studies, which can probe
the whole Brillouin zone also for the excitation continua!®'>”°, thus 
allowing a more detailed comparison to theory. The stability of the 
string states, as indicated by our results, provides the possibility to study 
their non-equilibrium behaviour in quantum magnets”® and cold-atom 
lattices”®. Thus, our results pave the way towards the deterministic
manipulation of complex magnetic many-body states in solid-state
materials and shed light on the study of quantum quench dynamics'!,
the Hubbard model”, and string excitations in string theory’>*’.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Sample preparation. High-quality single crystals of SrCo2V2O8 were grown
using the floating-zone method. Crystal structure and magnetic properties
were characterized by X-ray diffraction, neutron diffraction and magnetization
measurements”. For the optical experiments, single crystals were oriented using
X-ray Laue diffraction and cut perpendicular to the tetragonal c axis with a typical
surface area of 4 mm × 4 mm and a thickness of 1 mm.

Terahertz spectroscopy in magnetic fields. Low-frequency optical experiments 
were carried out in Augsburg. For the transmission spectroscopy below 1 THz, 
backward wave oscillators were used as tunable sources of electromagnetic waves. 
A magneto-optic cryostat (Oxford Instruments/Spectromag) was used to apply 
external magnetic fields up to 7 T and to control temperatures down to 2 K. High- 
field optical measurements were performed in the High Field Magnet Laboratory 
in Nijmegen. Transmission spectra were measured using a Fourier-transform 
spectrometer Bruker IFS-113v, combined with a 30-tesla Bitter electromagnet. 
Terahertz electromagnetic waves were generated by a mercury lamp and detected
by a silicon bolometer. For all optical measurements, the external magnetic fields
were applied parallel to the crystallographic c axis (longitudinal field B || c) and
to the propagation direction of the electromagnetic wave (Faraday configuration
B || k).

Crystal and magnetic structure of SrCo2V2O8. The spin-1/2 Heisenberg-Ising
antiferromagnetic chain system SrCo2V2O8 crystallizes in a tetragonal
structure with space group I4_1cd (Extended Data Fig. 1) and lattice constants
a = 12.2710(1) Å and c = 8.4192(1) Å at room temperature. The screw-chain
structure is based on edge-shared CoO6 octahedra. The screw axis of every chain
is along the crystallographic c direction with a period of four Co²⁺ ions (Extended
Data Fig. 1a). In each unit cell, there are four chains, two with left-handed screw
axes and two with right-handed ones (Extended Data Fig. 1b). The leading
inter-chain couplings are between Co²⁺ ions from the neighbouring chains with
the same chirality’, as indicated in Extended Data Fig. 1b. Compared to the
intra-chain interaction, the inter-chain coupling is almost negligible, J_⊥/J < 10⁻²
(refs 32, 33). Owing to crystal-field effects and spin-orbit coupling, the atomic
magnetic moments of each Co²⁺ ion form a ground state of Kramers doublets
corresponding to the total angular momentum of 1/2, which comprises spin and
orbital degrees of freedom. According to crystal-field theory”!** and electron
spin-resonance measurements”, the magnetic gap between the two lower-lying
Kramers doublets is about 22 meV. The nearest-neighbour exchange interactions
between the Kramers doublets in the ground state (corresponding to m_J = ±1/2)
can be described by the spin-1/2 Heisenberg-Ising antiferromagnetic model. Below
T_N ≈ 5 K, a Néel-ordered phase is stabilized in zero magnetic field??.

High-field magnetization in SrCo2V2O8. Magnetization measurements were
performed at the Dresden High Magnetic Field Laboratory in a pulsed magnetic
field up to 60 T. Extended Data Fig. 2a shows the magnetization in a longitudinal
field B || c at 1.7 K. The magnetization curve exhibits distinct features in different
phases and a clear signature of a quantum phase transition. At zero magnetic field,
the Néel-ordered phase is characterized by gapped fractional spinon excitations.
A clear onset of magnetization occurs at B_c = 4 T, indicating the phase transition
to the one-dimensional gapless phase (critical regime). Slightly above B_c, the
magnetization increases quasi-linearly, and then strongly nonlinearly above 10 T. This
behaviour is also reflected in the field derivative of the magnetization, dM/dH: a
nearly constant value is followed by an evident upturn at higher fields (Extended
Data Fig. 2b). Field-induced phase transitions at B_c and B_s are evidenced by the
anomalies in the field derivative of magnetization. Above B_s = 28.7 T, the spins are
fully field-polarized and the system enters a field-induced ferromagnetic phase
with saturated magnetization. We use the spin-1/2 Heisenberg-Ising antiferromagnetic
model (equation (1)) to simulate the field dependence of the magnetization
in the gapless one-dimensional critical regime between B_c and B_s (ref. 2). Using the
same parameters as determined from the excitation spectra (J = 3.55 meV, g_∥ = 6.2
and Δ = 2; Fig. 4), we can describe the experimental magnetization curve quite
well.

Low-energy phonon spectrum in SrCo2V2O8. The low-energy phonon reflection
spectrum was measured using Fourier-transform infrared spectroscopy. Extended
Data Fig. 3 shows the phonon spectra of SrCo2V2O8 measured for the polarization
E^ω || a in the relevant spectral range. The strong reflection due to phonon
excitations from 8 meV to 13.5 meV strongly reduces the transmission in the
corresponding spectral range.

Bethe-ansatz formalism. We use the Bethe ansatz to obtain the eigenenergies
and eigenwavefunctions of the one-dimensional spin-1/2 antiferromagnetic
Heisenberg-Ising model:

$$H(\Delta, h) = J \sum_{n=1}^{N} \left[ \left( S_n^x S_{n+1}^x + S_n^y S_{n+1}^y \right) + \Delta S_n^z S_{n+1}^z \right] - h \sum_{n=1}^{N} S_n^z \qquad (1)$$

in which Δ > 1 accounts for the Ising anisotropy in SrCo2V2O8 (refs 23, 24) and h is
related to the external magnetic field B by h = g_∥ μ_B B. The Hamiltonian in equation (1)
reduces to the isotropic Heisenberg model for Δ = 1, whereas for the model with
easy-plane anisotropy, 0 < Δ < 1. The dynamical properties of this model have
been studied extensively in various regimes!"11°35-°8. Here, we follow an established
theoretical approach®* to solve the Heisenberg-Ising model, and focus
on the comparison with the experimental results in SrCo2V2O8. Details of the
theoretical methods and results are presented in ref. 26.
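
For orientation, the model of equation (1) can be checked on a very short chain by brute-force diagonalization. The sketch below is not the Bethe-ansatz method used in this work; it is a minimal illustration that assumes periodic boundary conditions, a chain of only six sites, energies in units of J, and hypothetical function names. The test fields are chosen around h = J(1 + Δ), the field at which a single flipped spin stops lowering the energy of the fully polarized state of this model.

```python
import numpy as np


def site_operator(op, site, n_sites):
    """Embed a 2x2 single-site operator at position `site` of an n_sites chain."""
    result = np.array([[1.0]])
    for k in range(n_sites):
        result = np.kron(result, op if k == site else np.eye(2))
    return result


def heisenberg_ising_hamiltonian(n_sites, J=1.0, delta=2.0, h=0.0):
    """Dense matrix of equation (1) with periodic boundaries (n_sites >= 3)."""
    s_plus = np.array([[0.0, 1.0], [0.0, 0.0]])
    s_minus = s_plus.T
    s_z = np.diag([0.5, -0.5])
    dim = 2 ** n_sites
    H = np.zeros((dim, dim))
    for n in range(n_sites):
        m = (n + 1) % n_sites  # right-hand neighbour, wrapping around
        # S^x S^x + S^y S^y = (S^+ S^- + S^- S^+) / 2
        flip = site_operator(s_plus, n, n_sites) @ site_operator(s_minus, m, n_sites)
        H += 0.5 * J * (flip + flip.T)
        H += J * delta * site_operator(s_z, n, n_sites) @ site_operator(s_z, m, n_sites)
        H -= h * site_operator(s_z, n, n_sites)
    return H


def ground_state_sz(n_sites, J=1.0, delta=2.0, h=0.0):
    """Expectation value of the total S^z in the lowest-energy eigenstate."""
    s_z = np.diag([0.5, -0.5])
    total_sz = sum(site_operator(s_z, n, n_sites) for n in range(n_sites))
    _, vectors = np.linalg.eigh(heisenberg_ising_hamiltonian(n_sites, J, delta, h))
    ground = vectors[:, 0]
    return float(ground @ total_sz @ ground)


if __name__ == "__main__":
    for field in (0.0, 2.7, 3.3):  # in units of J, with delta = 2
        print(field, ground_state_sz(6, h=field))
```

On such a small chain the magnetization rises in discrete steps rather than smoothly, so the sketch only illustrates the structure of the Hamiltonian and the existence of a polarized phase at high field.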

Starting from the fully polarized state as the reference state, we can divide the
states into subspaces according to the number of flipped spins r, or equivalently
the total spin in the z direction S^z_T = N/2 − r. The corresponding wavefunctions
are obtained by solving the Bethe-ansatz equations*®

$$N \theta_1(\lambda_j) = 2\pi I_j + \sum_{l=1}^{r} \theta_2(\lambda_j - \lambda_l), \qquad j = 1, \ldots, r$$

in which θ_1(λ) and θ_2(λ), as functions of rapidity λ, are defined by

$$\theta_n(\lambda) = 2 \arctan\left[ \frac{\tan(\lambda)}{\tanh(n\eta/2)} \right] + 2\pi \left\lfloor \frac{\mathrm{Re}(\lambda)}{\pi} + \frac{1}{2} \right\rfloor$$

where the anisotropy parameter η is fixed by Δ = cosh η and ⌊A⌋ denotes the floor
function, which gives the greatest integer not larger than A.

The Bethe quantum numbers {I_j}, j ∈ {1, ..., r}, take integer values when r is odd
and half-integer values when r is even (see following sections). A state is called a
real Bethe eigenstate (psinon or antipsinon) if all of the rapidities λ_j are real and a
string state if there is a complex-valued λ_j.

A schematic of the distribution of ground-state Bethe quantum numbers is
shown in Extended Data Fig. 4a for N = 32 and S^z_T = 8. The low-lying excitations
of real Bethe states can be classified according to either one of the two patterns that
have purely real rapidities {λ_j}, j ∈ {1, ..., r} (refs 4, 6-10, 35): n-pair psinon-psinon
states (nψψ) and n-pair psinon-antipsinon states (nψψ*). The Bethe quantum
numbers of an nψψ state can be produced by first extending the left (right) edge
of the ground-state Bethe quantum numbers further to the left (right) by n, then
removing 2n numbers within the extended range. Those of an nψψ* state can be
obtained by removing n numbers in the ground-state range and putting them
outside the range. Illustrations of the two situations are shown in Extended Data
Fig. 4b and c for 1ψψ and 1ψψ*, respectively. Extended Data Fig. 4d shows the
Bethe quantum numbers of a string state, which has larger energy than the real
Bethe states.

Transverse dynamic structure factors. The dynamic structure factors (DSFs)
relevant to the experiment are calculated for Ising anisotropy Δ = 2 and system
size N = 200 using the determinant formulas in the algebraic Bethe-ansatz
formalism®. Extended Data Fig. 5 displays the results of S^{-+}(q, ω) and S^{+-}(q, ω) at a
representative magnetization 2m = 0.4 (ref. 26). S^{-+}(q, ω) is dominated by gapless
continua that are formed by 1ψψ and 2ψψ excitations, whereas in S^{+-}(q, ω) there
are several well-separated dynamical branches. For S^{+-}(q, ω), the lowest-lying
gapless continua are formed by 1ψψ* and 2ψψ* excitations. The corresponding
spectral weights are located mainly in the energy range ħω < 3J. The higher-energy
separated continua correspond to two-string and three-string excitations that are
found in the spectral ranges ħω > 3J and ħω > 5J, respectively.

The momentum-integrated sum rules. By rescaling S^± to S^±/√2, the
momentum-integrated sum rules can be expressed as

$$R_{a\bar{a}} = \frac{1}{N} \sum_q \int_{-\infty}^{\infty} \frac{d\omega}{2\pi} S^{a\bar{a}}(q, \omega) = \frac{1}{4} + \frac{m}{2} c_a$$

where c_a = ±1 for a = ±, respectively.

To evaluate the saturation levels of the sum rules, we define the ratio

$$\nu_{a\bar{a}} = \frac{R'_{a\bar{a}}}{R_{a\bar{a}}}$$

where R'_{aā} is calculated from a partial summation over the selected excitations.

Ratio of the momentum-integrated intensity of transverse DSFs. The ratios of
momentum-integrated intensity for S^{+-} and S^{-+} are displayed in Extended Data
Fig. 6a and b, respectively. The calculations are carried out for 2m varied from 0.1
to 0.9 with steps of 0.1 (ref. 26). The good saturations of sum rules (above 87%)
for both S^{+-} and S^{-+} over the entire range of magnetization indicate that most
of the spectral weights are accounted for in the calculations. Particularly for S^{+-},
the string excitations, and especially the three-string states, become progressively
important when magnetization is lowered. Hence, the string states have a
predominant role in the dynamical properties in the low-magnetization region.
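
As a small numerical aid to the sum rules above, the sketch below evaluates the target value 1/4 + c_a m/2 and the saturation ratio for a given partial sum. It assumes that m is the magnetization per site (so that 2m runs from 0.1 to 0.9) and that the sum rule follows from the single-site spin-1/2 identities S^+S^-/2 = 1/4 + S^z/2 and S^-S^+/2 = 1/4 − S^z/2, which the code verifies explicitly. The function names and the example partial sum are hypothetical.

```python
import numpy as np

S_PLUS = np.array([[0.0, 1.0], [0.0, 0.0]])
S_MINUS = S_PLUS.T
S_Z = np.diag([0.5, -0.5])


def single_site_identity_holds():
    """Check S^+S^-/2 = 1/4 + S^z/2 and S^-S^+/2 = 1/4 - S^z/2 for spin 1/2."""
    plus_minus = 0.5 * S_PLUS @ S_MINUS
    minus_plus = 0.5 * S_MINUS @ S_PLUS
    return bool(
        np.allclose(plus_minus, 0.25 * np.eye(2) + 0.5 * S_Z)
        and np.allclose(minus_plus, 0.25 * np.eye(2) - 0.5 * S_Z)
    )


def sum_rule_target(m, a):
    """Total weight 1/4 + a*m/2, with a = +1 for S^{+-} and a = -1 for S^{-+}."""
    if a not in (1, -1):
        raise ValueError("a must be +1 or -1")
    return 0.25 + 0.5 * a * m


def saturation_ratio(partial_sum, m, a):
    """Fraction of the sum rule exhausted by a partial summation over excitations."""
    return partial_sum / sum_rule_target(m, a)


if __name__ == "__main__":
    print(single_site_identity_holds())
    for two_m in (0.1, 0.5, 0.9):
        m = two_m / 2
        print(two_m, sum_rule_target(m, +1), sum_rule_target(m, -1))
    # Hypothetical partial sum, for illustration only.
    print(saturation_ratio(0.3045, 0.2, +1))
```

The two targets always add up to 1/2, so the weight shifts from S^{-+} to S^{+-} as the magnetization grows.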
Comparison between theory and experiment. With the four-fold screw
axis (Extended Data Fig. 1a), the Brillouin zone is folded by a factor of four in
SrCo2V2O8; thus, as well as those at q = 0, the excitations at q = π and q = π/2
should also be considered for comparison to the terahertz experiments. On the
basis of the results shown in Extended Data Fig. 5, in the following we present a
detailed analysis of the magnetic-field-dependent properties of the various
excitations for these momenta.

Extended Data Figs 7-10 show the DSFs for psinon-psinon, psinon-antipsinon,
two-string and three-string excitations, respectively, as functions of energy for
magnetizations 2m = 0.1-0.9 and q = 0, q = π/2 and q = π.

As shown in Extended Data Fig. 7, for the psinon-psinon excitations the DSF
spectra exhibit very sharp peaks, and the peak positions shift with magnetic field
(or magnetization) only for q = π/2. By contrast, at q = 0 and q = π the excitations
form continua in the whole spectral range (2J), and we cannot define peak
positions, or the peak positions are almost independent of magnetic field. Thus,
we can clearly distinguish the psinon-psinon excitations at q = π/2 from those at
q = 0 and q = π. Moreover, the softening of the q = π/2 modes upon increasing
magnetic fields is in clear contrast to the hardening of the other types of excitation
(see Fig. 4). This is because psinon-psinon excitations correspond to flipping one
spin parallel to the magnetic field. Indeed, the psinon-psinon excitations at q = π/2
are observed by our terahertz spectroscopy (see Fig. 4).

The psinon-antipsinon excitations exhibit very sharp peaks at q = 0 and q = π/2,
but form continua at q = π (see Extended Data Fig. 8). At q = 0, we can clearly see
the peaks in the whole field range (Extended Data Fig. 8a), whereas for q = π/2
peaks are well resolved only above half-magnetization saturation (Extended Data
Fig. 8b). For both of the modes, the peak positions shift to higher energy with
increasing magnetic field. The psinon-antipsinon excitations at q = 0 are observed
experimentally for all magnetic fields above B_c = 4 T (see Fig. 4), whereas those at
q = π/2, which appear above the field corresponding to half-saturated magnetization,
B_hs = 25 T, and at low energies, are yet to be found. The extended continua at
q = π are strongly overlapping for different magnetic fields, and so they cannot be
resolved by the present magneto-optical spectroscopy.

Extended Data Fig. 9 shows the DSFs of two-string excitations. At q = π the
spectra exhibit sharp peaks with well-defined peak positions, whereas at q = 0 the
peaks are relatively broad. With increasing magnetic fields, these two modes are
evidently hardening and so can readily be identified. By contrast, the continua at
q = π/2 overlap with each other for different magnetic fields. Therefore, only at
q = π and q = 0 can the two-string excitations be observed experimentally (see
Fig. 4).

The three-string excitations form extended continua at q = 0 and q = π, which
overlap for the different magnetic fields, as shown in Extended Data Fig. 10. By
contrast, at q = π/2 the excitation spectra exhibit well-defined peaks with peak
positions shifting to higher energy with increasing magnetic field. This mode can
clearly be identified experimentally (see Fig. 4).

To summarize, only the excitations with well-defined peak positions that shift
evidently with magnetic fields can be resolved, as have been observed for R_0, R_{π/2},
χ_0^(2), χ_π^(2) and χ_{π/2}^(3) by our magneto-optic terahertz spectroscopy (see Fig. 4). By
contrast, the other excitations form continua in a broad spectral range and cannot
easily be resolved by the terahertz spectroscopy. The theoretical peak positions of
the excitations R_0, R_{π/2}, χ_0^(2), χ_π^(2) and χ_{π/2}^(3) are plotted as functions of magnetic field
and compared to the experimental results. We fix the exchange interaction and
Ising anisotropy to the previously determined values for SrCo2V2O8, J = 3.55 meV
and Δ = 2 (ref. 24), and with g-factor g_∥ = 6.2 all five experimentally observed
modes can be very well described by the theory, as presented in Fig. 4.

Data availability. The data that support the findings of this study are available 
from the corresponding author on reasonable request. 
Extended Data Figure 1 | Crystal and magnetic structure of SrCo2V2O8.
a, The screw-chain structure consists of edge-shared CoO6 octahedra.
Each chain has screw-axis symmetry with a period of four Co²⁺ ions
(as numbered by the integers 1, 2, 3 and 4), corresponding to the lattice
constant along the c axis. The Néel-ordered phase is illustrated by
antiparallel arrows representing magnetic moments at the Co²⁺ sites.
Intra-chain nearest-neighbour interaction is denoted by J. b, Viewing from
the c axis, each unit cell contains four screw chains with left- or right-handed
screw axes. The leading inter-chain coupling J_⊥ is indicated, which
is between the Co²⁺ ions in the same layer (denoted by the same integer
as the Co site) and from chains with the same chirality. It is very small
compared to the intra-chain interaction, J_⊥/J < 10⁻² (refs 32, 33).
Extended Data Figure 2 | High-field magnetization and magnetic
susceptibility of SrCo2V2O8. a, Magnetization M as a function of an
applied longitudinal magnetic field B along the Ising axis (B || c), measured
at 1.7 K (circles). Theoretical magnetization of the Heisenberg-Ising
chain model is shown by the dashed line. b, Magnetic susceptibility
dM/dH as a function of the applied longitudinal field B. A quantum phase
transition from the Néel-ordered phase to the critical phase is revealed by
the onset of magnetization and the peak in the susceptibility curve at the
critical field B_c = 4 T. Saturated magnetization is observed above the field
B_s = 28.7 T and indicated by the sharp peak in the susceptibility. The small
anomaly at B_hs = 25 T seen in the susceptibility is close to the field of
half-saturated magnetization.
Extended Data Figure 3 | Low-energy phonon spectrum of SrCo2V2O8. The phonon spectra of SrCo2V2O8 measured for the polarization E^ω || a at 5 K.
Strong reflectivity due to phonon excitations is observed in the spectral range 8-13.5 meV.
Extended Data Figure 4 | Schematics of patterns of Bethe quantum numbers. a, The ground state. b, One-pair psinon-psinon state 1ψψ. c, One-pair
psinon-antipsinon state 1ψψ*. d, Length-two string state 1χ^(2)R. The system size is taken as N = 32 and the magnetization is S^z_T = 8.
Extended Data Figure 5 | DSFs. a, b, S^{+-}(q, ω) and S^{-+}(q, ω),
respectively, as functions of energy ħω/J (vertical axis) and momentum
q/π (horizontal axis) for 2m = 0.4 and N = 200. The gapless continua are
formed by real Bethe eigenstates (psinon-antipsinon pairs in S^{+-} and
psinon-psinon pairs in S^{-+}). For S^{+-} (a), the higher-energy continua
correspond to excitations of two-string (ħω > 3J) and three-string
(ħω > 5J) states.
Extended Data Figure 6 | The momentum-integrated ratios. a, b, ν_{+-}
for S^{+-} and ν_{-+} for S^{-+}, respectively, as functions of magnetization 2m.
In a, the green line is the 1ψψ* contribution. The blue, red and the black
lines are augmented by progressively taking into account the 2ψψ*, two-string
and three-string contributions, respectively. In b, the blue and black
lines represent the 1ψψ and 1ψψ + 2ψψ contributions, respectively.
Extended Data Figure 7 | DSF of psinon-psinon pairs as a function of energy for 2m = 0.1-0.9. a, q = 0; b, q = π/2; c, q = π.
Extended Data Figure 8 | DSF of psinon-antipsinon pairs as a function of energy for 2m = 0.1-0.9. a, q = 0; b, q = π/2; c, q = π.
Extended Data Figure 9 | DSF of two-string states as a function of energy for 2m = 0.1-0.9. a, q = 0; b, q = π/2; c, q = π.
Extended Data Figure 10 | DSF of three-string states as a function of energy for 2m = 0.1-0.9. a, q = 0; b, q = π/2; c, q = π.
Processing bulk natural wood into a
high-performance structural material 


Jianwei Song!*, Chaoji Chen!*, Shuze Zhu**, Mingwei Zhu'*, Jiaqi Dai!, Upamanyu Ray’, Yiju Li', Yudi Kuang!, Yongfeng Li', 
Nelson Quispe?, Yonggang Yao!, Amy Gong!, Ulrich H. Leiste*, Hugh A. Bruck’, J. Y. Zhu‘, Azhar Vellore®, Heng Li®, 
Marilyn L. Minus®, Zheng Jia?, Ashlie Martini®, Teng Li? & Liangbing Hu! 


Synthetic structural materials with exceptional mechanical 
performance suffer from either large weight and adverse 
environmental impact (for example, steels and alloys) or complex 
manufacturing processes and thus high cost (for example, 
polymer-based and biomimetic composites)'~*. Natural wood is a 
low-cost and abundant material and has been used for millennia 
as a structural material for building and furniture construction’. 
However, the mechanical performance of natural wood (its strength 
and toughness) is unsatisfactory for many advanced engineering 
structures and applications. Pre-treatment with steam, heat, 
ammonia or cold rolling!*! followed by densification has led 
to the enhanced mechanical performance of natural wood. 
However, the existing methods result in incomplete densification 
and lack dimensional stability, particularly in response to humid 


a Natural wood 


1. Chemical treatment 


2. Densification 


This work 


Specific strength (MPa cm? g-') & 


Densified wood 
(approximately 80% reduction in thickness) 


Cellulose nanofibre 


environments", and wood treated in these ways can expand and 
weaken. Here we report a simple and effective strategy to transform 
bulk natural wood directly into a high-performance structural 
material with a more than tenfold increase in strength, toughness 
and ballistic resistance and with greater dimensional stability. 
Our two-step process involves the partial removal of lignin and 
hemicellulose from the natural wood via a boiling process in an 
aqueous mixture of NaOH and Na2SO3 followed by hot-pressing,
leading to the total collapse of cell walls and the complete 
densification of the natural wood with highly aligned cellulose 
nanofibres. This strategy is shown to be universally effective for 
various species of wood. Our processed wood has a specific strength 
higher than that of most structural metals and alloys, making it a 
low-cost, high-performance, lightweight alternative. 


Figure 1 | Processing approach and mechanical performance of densified
wood. a, Schematic of the top-down two-step approach to transforming bulk
natural wood directly into super-strong and tough densified wood. Step 1,
chemical treatment to partially remove lignin/hemicellulose; step 2,
mechanical hot-pressing at 100 °C, which leads to a reduction in thickness of
about 80%. Most of the densified wood consists of well aligned cellulose
nanofibres, which greatly enhance hydrogen bond formation among
neighbouring nanofibres. b, Specific tensile strength of the resulting
densified wood (422.2 ± 36.3 MPa cm³ g⁻¹, mean ± standard deviation) is
shown to be higher than those of typical metals (the Fe-Al-Mn-C alloy,
TRIPLEX and high-specific-strength steel, HSSS), and even of lightweight
titanium alloy (Ti6Al4V). Error bars in Figs 1-4 and Extended Data Figs 1-10
show standard deviation with n = 5 repeats, unless noted otherwise.
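
The specific strengths quoted for the densified wood can be cross-checked with simple arithmetic. The sketch below assumes that specific strength is defined as tensile strength divided by density, and back-calculates the density implied by the pairs of values reported in this Letter (548.8 MPa with 422.2 MPa cm³ g⁻¹, and 587 MPa with 451 MPa cm³ g⁻¹). The function names are hypothetical, and the implied density is an inference from these numbers rather than a value stated here.

```python
def implied_density(tensile_strength_mpa, specific_strength):
    """Density in g/cm^3 implied by strength (MPa) and specific strength (MPa cm^3/g)."""
    return tensile_strength_mpa / specific_strength


def specific_strength_ratio(value, reference):
    """How many times larger one specific strength is than a reference value."""
    return value / reference


if __name__ == "__main__":
    # Mean values: tensile strength from Fig. 3, specific strength from Fig. 1b.
    print(implied_density(548.8, 422.2))
    # Record values quoted in the main text.
    print(implied_density(587.0, 451.0))
    # Densified wood relative to the titanium alloy value of about 244 MPa cm^3/g.
    print(specific_strength_ratio(422.2, 244.0))
```

Both pairs of values point to the same density of roughly 1.3 g cm⁻³, which suggests the mean and record figures are mutually consistent.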




1Department of Materials Science and Engineering, University of Maryland, College Park, Maryland 20742, USA. 2Department of Mechanical Engineering, University of Maryland, College Park,
Maryland 20742, USA. 3Department of Aerospace Engineering, University of Maryland, College Park, Maryland 20742, USA. 4Forest Products Laboratory, USDA Forest Service, Madison, Wisconsin
53726, USA. 5Department of Mechanical Engineering, University of California Merced, Merced, California 95343, USA. 6Department of Mechanical and Industrial Engineering, Northeastern
University, Boston, Massachusetts 02115, USA.
*These authors contributed equally to this work.


224 | NATURE | VOL 554 | 8 FEBRUARY 2018 


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


LETTER 



Figure 2 | Structural characterization of natural wood and densified
wood. a, Photograph of natural wood sample. b, Scanning electron
microscopy (SEM) image of the natural wood sample perpendicular to
the tree growth (L) direction, clearly showing the porous structure in
the RT plane. c, SEM image of the natural wood sample in the RL plane,
revealing the cross-section view of the lumina along the L direction
(highlighted by dashed lines). d, Photograph of densified wood. e, SEM
image of the densified wood in the RT plane, showing the fully collapsed
lumina. The open spaces between the cell walls in natural wood are
eliminated, resulting in a unique laminated structure with cell walls tightly
intertwined with each other. f, SEM image of the densified wood in the RL
plane shows the dense laminated structure cross-section. g, Wide-angle
X-ray diffraction pattern of the densified wood, showing that the cellulose
nanofibre alignment is well preserved after densification. h, Chemical
treatment leads to substantial removal of lignin (before, 20.8% ± 1.2%;
after, 11.3% ± 0.5%) and hemicellulose (before, 19.5% ± 0.7%; after,
5.2% ± 0.5%) in natural wood, but only modest dissolution of cellulose
content (before, 44.0% ± 1.0%; after, 38.7% ± 0.8%). i, Magnified SEM
image of the densified wood, showing the highly aligned cellulose
nanofibres.

Figure 1a shows a schematic of our top-down two-step approach to
directly transforming bulk natural wood. Our approach involves partial
removal of lignin/hemicellulose from bulk natural wood followed
by hot-pressing (Fig. 1a; see Methods). Natural wood contains many
lumina (tubular channels 20-80 μm in diameter) along the wood
growth direction (Fig. 2a-c and Extended Data Fig. 1d, e). Chemical
treatment leads to substantial reduction of lignin/hemicellulose content
in natural wood, but only modest reduction of cellulose content,
largely owing to the different stabilities of these three components in
the NaOH/Na2SO3 solution (Fig. 2h). By partial removal of lignin/
hemicellulose from the wood cell walls, the wood becomes more porous
and less rigid (Extended Data Fig. 1a, b). Upon hot-pressing at 100 °C
perpendicular to the wood growth direction, the wood lumina as well
as the porous wood cell walls collapse entirely, resulting in a densified
piece of wood reduced in thickness to about 20% (Fig. 2d) and with
a threefold increase in density (Extended Data Fig. 1c). The densified
wood has a unique microstructure: the fully collapsed wood cell walls
are tightly intertwined along their cross-section (Fig. 2e and Extended
Data Fig. 1g, j) and densely packed along their length direction
(Fig. 2f and Extended Data Fig. 1h, i). By contrast, pure hot-pressing
of natural wood without partial lignin/hemicellulose removal can only
modestly densify the wood, leaving many gaps in between collapsed
cell walls (Extended Data Fig. 2a-c). Wide-angle X-ray diffraction
(Fig. 2g), small-angle X-ray scattering and scanning electron
microscopy (SEM) (Fig. 2i and Extended Data Fig. 1k, l) further reveal
that, at a finer scale, the cellulose nanofibres within the densified wood
remain highly aligned, similar to natural wood but much more densely
packed.

The mechanical properties of the densified wood are not only 
remarkably superior to those of natural wood, but also exceed those 
of many widely used structural materials (for example, plastics, steel 
and alloys). Figure 3a compares the tensile stress-strain curves for 
natural wood and densified wood. Both curves show a linear defor- 
mation behaviour before tensile failure. The densified wood demon- 
strates a record high tensile strength of 587 MPa, which is 11.5 times 
higher than that of the untreated natural wood (Fig. 3a, b), and also 
much higher than that of typical plastics?” (such as nylon 6, poly- 
carbonate, polystyrene and epoxy; Fig. 3c) and other densified woods 
(Extended Data Fig. 6m). A long-standing challenge in engineer- 
ing material design is the conflict between strength and toughness, 
because these properties are in general mutually exclusive?>”°, 
Interestingly, the large increase in tensile strength of the densified wood 
is not accompanied by a decrease in toughness. Both the work of frac- 
ture and the elastic stiffness of the densified wood are more than ten 
times higher than those of natural wood (Fig. 3b and Extended Data 
Fig. 3a). Charpy impact tests of the densified wood yield an impact 
toughness of 11.41+0.5 J cm~’, 8.3 times higher than that of the 
natural wood (1.38 + 0.3 J cm’) (Extended Data Fig. 3d). The scratch 
hardness and hardness modulus of the densified wood are 30 times and 
13 times higher than those of natural wood, respectively (Extended 
Data Fig. 3b, c, e). The flexural strength of the densified wood is about 
6 times and 18 times higher than that of natural wood along and 
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Figure 3 | Superb mechanical properties of densified wood and 
mechanistic understanding. a, Tensile stress-strain curves for natural 
wood and densified wood. b, Compared with natural wood (strength, 

46.7 + 4.5 MPa; work of fracture, 0.39 + 0.04 MJ m~%), the densified wood 
(strength, 548.8 + 47.2 MPa; work of fracture, 3.9 + 0.2 MJ m~*) has 
greatly improved strength (12 times) and work of fracture (10 times). 

c, Comparison of the tensile strength of densified wood (548.8 + 47.2 MPa) 
with other widely used polymer-based materials. d, e, SEM images of the 


perpendicular to the growth direction, respectively (Extended Data 
Fig. 3f-n). The compressive strength of the densified wood is about 
5.5 times and 33-52 times higher than that of natural wood along and 
perpendicular to the growth direction, respectively (Extended Data 
Fig. 4). We found that partial lignin removal allows for the highest 
density of the resulting wood with the best tensile strength, work of 
fracture and axial compressive strength (Extended Data Figs 2f-h and 
4j). Without lignin removal, it is difficult to hot-press natural wood into 
a completely compact wood (Extended Data Fig. 2a-c shows numerous
voids left between the cell walls). However, total lignin removal leads to 
wood that can be easily crushed during hot-pressing, probably owing 
to the absence of lignin as a binder (Extended Data Fig. 2d, e). The 
intrinsically light weight of cellulose also results in a specific strength
of the densified wood (451 MPa cm³ g⁻¹) even higher than that of
lightweight titanium alloy (about 244 MPa cm³ g⁻¹) (Fig. 1b). The
densified wood is stable under moisture attack. For example, subjected
to 95% relative humidity (RH) for 128 h, the densified wood swells
to produce an increase of only 8.4% in thickness, with only a modest
drop in tensile strength (493.1 ± 20.3 MPa, still 10.6 times higher
than that of natural wood in ambient environment). Furthermore, by
applying a standard surface treatment (painting), the densified wood 
is shown to be immune from moisture attack in the accelerated tests 
(Extended Data Fig. 5). More comprehensive studies demonstrate that 
our top-down two-step processing approach is universally effective for 
various species of wood (both hardwood and softwood) and can greatly 
enhance their strength and toughness simultaneously (Extended Data 
Fig. 6a-l). 
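
As a quick arithmetic check on the ratios quoted in this paragraph, the sketch below recomputes the specific strength and two fold increases from values reported in the paper. Pairing the 586.8 MPa strength with the 1.3 g cm⁻³ density (the 45% lignin-removal row of Extended Data Fig. 2h) is an assumption about how the 451 MPa cm³ g⁻¹ figure was obtained, and the function names are illustrative only.

```python
def specific_strength(strength_mpa: float, density_g_cm3: float) -> float:
    """Strength divided by density, in MPa cm^3 g^-1."""
    return strength_mpa / density_g_cm3


def fold_increase(treated: float, untreated: float) -> float:
    """Ratio of a treated-sample value to the untreated reference."""
    return treated / untreated


if __name__ == "__main__":
    # Assumed pairing: strength and density of the 45% lignin-removal sample.
    print(round(specific_strength(586.8, 1.3)))   # 451
    # Tensile strength: densified vs natural wood (mean values).
    print(round(fold_increase(548.8, 46.7), 1))   # 11.8
    # Tensile strength after 128 h at 95% RH vs natural wood.
    print(round(fold_increase(493.1, 46.7), 1))   # 10.6
```

The three results agree with the rounded figures in the text (451 MPa cm³ g⁻¹, about 12 times and 10.6 times).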
tensile fracture surface of the natural wood and densified wood samples,
respectively (RT plane). f, Simulation model of representative deformation 
and failure process in natural wood, containing a bundle of seven hollow 
wood lumina. g, Corresponding simulation model for densified wood with 
seven wood lumina fully collapsed. h, i, Simulated stress-strain curves for 
the relative sliding in the hollow lumina bundle and in collapsed lumina (h), 
which indicate a 7.5-fold increase in strength and work of fracture as a 
result of the densification treatment (i). 


A comparison between the tensile fracture surface of natural wood 
and that of the densified wood offers insights into the strengthening and 
toughening mechanisms in the densified wood. Tensile failure of natural 
wood initiates from relative sliding among open wood lumina followed 
by the pulling out and tearing of the wood lumina along the fracture 
surface (Fig. 3d and Extended Data Fig. 7a, b), while the tensile failure of 
the densified wood results from relative sliding among densely packed 
wood cell walls followed by the pulling out and fracture of the cell 
walls along the fracture surface (Fig. 3e and Extended Data Fig. 7c, d). 
Given that cellulose is the dominant constituent of the densified wood, 
the corresponding toughening and strengthening mechanisms can be 
understood as follows. The densely packed and intertwined wood cell 
walls in the densified wood at the microscale lead to a high degree 
of alignment of cellulose nanofibres and thus drastically increase the 
interfacial area among nanofibres. At the molecular scale, owing to 
the rich hydroxyl groups in cellulose molecular chains, relative sliding 
among densely packed wood cell walls involves an enormous number 
of repeating events of hydrogen-bond formation, breaking and reformation
(Fig. 1a and Extended Data Fig. 1m).
Consequently, the total energy needed to fracture the densified wood 
is much higher than that needed to fracture natural wood. In other 
words, the densified wood is much tougher than natural wood. The 
densely packed microstructure also greatly reduces both the quantity 
and size of defects (ranging from vessels to tracheids and pits on cell 
walls; Extended Data Fig. 1d-i) in the densified wood, producing a 
much higher strength than that of natural wood. Further modelling 
of the mechanics of the envisioned deformation and failure processes 
Figure 4 | Ballistic test. a, Photographs of the natural wood, monolayer
densified wood and the X-Y-X-Y-X laminated densified wood before
(top row) and after (bottom row) the ballistic test. b, Comparison
of ballistic energy absorption of three types of wood (natural wood,
0.60 ± 0.03 kJ m⁻¹; monolayer densified wood, 4.30 ± 0.08 kJ m⁻¹;
X-Y-X-Y-X laminate, 6.0 ± 0.1 kJ m⁻¹). c, d, Mechanistic understanding
of enhanced ballistic resistance in laminated densified wood. Simulation
trajectory snapshots during the separation of two neighbouring collapsed


in natural wood and densified wood (details in Methods) quantita- 
tively verifies the above strengthening and toughening mechanisms. It 
is shown that both the maximum nominal stress (indicating strength) 
and the energy dissipation (indicating toughness) associated with slid- 
ing between the densely packed collapsed wood lumina are about 7.5 
times higher than those associated with hollow wood lumina (Fig. 3f-i 
and Extended Data Fig. 8a-c). Hydrogen bonds formed between neigh- 
bouring cellulose nanofibres make a pivotal contribution to the remark- 
ably enhanced strength and toughness (Extended Data Fig. 8d-f). 

The well-aligned cellulose nanofibres dictate the anisotropic 
mechanical properties of densified wood (Extended Data Fig. 9a-c). 
To explore the full potential of the exceptional mechanical properties
of densified wood, we laminated two layers of natural wood
with perpendicular wood fibre orientations, and followed the same
process to obtain a bilayer densified wood (referred to as
X-Y). Tensile strengths of the X-Y densified wood along two perpendicular
wood fibre directions are shown to be nearly the same
(221.6 ± 20.0 MPa and 225.6 ± 18.0 MPa, respectively, Extended
Data Fig. 9d-f), and much higher than the T-direction strength of
monolayer densified wood (43.3 ± 2.0 MPa) or that of natural wood
(5.1 ± 0.4 MPa).

These strong and tough yet lightweight densified woods hold 
promise as materials for low-cost armour and ballistic energy 
absorption. To demonstrate such a potential, we used the same 
processing approach to make a five-layer densified wood with fibre 
orientation alternating by 90° from layer to layer (referred to as 
X-Y-X-Y-X). We performed ballistic tests (see Methods) on natural 
wood, monolayer densified wood and X-Y-X-Y-X densified wood 
in an air-gun ballistic tester (Extended Data Fig. 10a and Fig. 4a). The 
ballistic energy absorption per unit sample thickness for monolayer 
densified wood is 4.3 ± 0.08 kJ m⁻¹, a remarkable sevenfold increase
from that of natural wood (0.6 ± 0.03 kJ m⁻¹). High-speed-camera
videos of the ballistic tests (see Supplementary Video 1) and further 
characterization of the fractured samples (Extended Data Fig. 10b-g) 
reveal that in the monolayer densified wood, the perforation opening 
wood cell walls in parallel are shown (c) along with simulation snapshots
for the same two wood cell walls sandwiched between another two
pairs of collapsed wood cell walls along the perpendicular direction
at different separation displacements (d). e, The separation force as a
function of separation displacement. The area below the curve indicates
energy dissipation. The sandwiched configuration (corresponding to
laminated densified wood) could dissipate more energy than the parallel
configuration (corresponding to monolayer densified wood).


by the steel projectile is smaller than that in the natural wood, and the 
wood surface is severely chapped, indicating much stronger bonding 
between highly packed wood cell walls (Fig. 4a). The ballistic resistance 
of the X-Y-X-Y-X densified wood is shown to be even higher and 
also more isotropic (Extended Data Fig. 10h). The projectile can break 
through the sample surface but is eventually trapped inside the sample 
without complete perforation. The resulting ballistic energy absorption 
is 6.0 ± 0.1 kJ m⁻¹, ten times higher than that of natural wood
(Fig. 4b). Further mechanics modelling attributes this enhanced and
isotropic ballistic resistance to the reinforcement effect acting between
the neighbouring wood layers of alternating orientation (Fig. 4c-e and
Extended Data Fig. 10i-l).


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Materials and chemicals. Basswood (Tilia), oak (Quercus), poplar (Populus), 
western red cedar (Thuja plicata) and eastern white pine (Pinus strobus) were used 
for the fabrication of densified wood. Sodium hydroxide (>97%, Sigma-Aldrich) 
and sodium sulfite (>98%, Sigma-Aldrich) and deionized (DI) water were used 
for processing the wood. 

Two-step process towards densified wood. First, natural wood blocks (typical 
sample dimension: 120.0 mm by 44.0 mm by 44.0 mm) were immersed in a boiling
aqueous solution of mixed 2.5 M NaOH and 0.4 M Na₂SO₃ for 7 h, followed
by immersion in boiling deionized water several times to remove the chemicals.
Next, the wood blocks were pressed at 100 °C under a pressure of about 5 MPa
for about 1 day to obtain the densified wood (115.6 mm by 46.5 mm by 9.5 mm).
By adjusting the boiling times, densified wood with different degrees of lignin 
removal can be obtained. 
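
The block dimensions quoted above imply the degree of compaction. The short sketch below computes it. Treating the 44.0 mm to 9.5 mm change as the pressing direction is an assumption read off the quoted dimensions, and the helper names are illustrative.

```python
def thickness_reduction(before_mm: float, after_mm: float) -> float:
    """Fractional reduction in thickness after hot-pressing."""
    return 1.0 - after_mm / before_mm


def volume_mm3(length_mm: float, width_mm: float, thickness_mm: float) -> float:
    """Volume of a rectangular block in cubic millimetres."""
    return length_mm * width_mm * thickness_mm


if __name__ == "__main__":
    natural = (120.0, 44.0, 44.0)    # mm, typical block before treatment
    densified = (115.6, 46.5, 9.5)   # mm, block after pressing
    reduction = thickness_reduction(natural[2], densified[2])
    ratio = volume_mm3(*densified) / volume_mm3(*natural)
    print(f"thickness reduction: {reduction:.1%}")
    print(f"volume ratio (densified / natural): {ratio:.3f}")
```

With the typical dimensions given, the thickness falls by roughly 78% and the block keeps about 22% of its original volume.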

Mechanical tests. The tensile, bending and compressive properties of the wood 
samples were measured using a Tinius Olsen H5KT tester. The dimensions for
tensile samples were approximately 100 mm by 6 mm by 1.5 mm. The samples
were clamped at both ends and stretched along the sample length direction until
they fractured with a constant test speed of 5 mm min⁻¹ at room temperature. The
dimensions for bending samples were approximately 35 mm by 5 mm by 4 mm.
Three-point bending tests were conducted for these samples, with the span between
the two bottom rollers 20 mm and the top roller pressing down at the centre at a
speed of 1 mm min⁻¹. The flexural stress is defined as the maximum tensile stress
at the bottom surface of the sample right below the top roller. The dimensions
for compressive samples were approximately 9 mm long, 9 mm wide and 4.5 mm
thick, and the samples were compressed along the thickness direction at a speed
of 1 mm min⁻¹.
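
The flexural stress defined above can be illustrated with the textbook expression for a rectangular beam in three-point bending, 3FL/(2bh²). That the authors used exactly this expression is an assumption, as is the choice of which sample dimension is the width b and which is the thickness h. The load value in the example is hypothetical.

```python
def flexural_stress_mpa(force_n: float, span_mm: float,
                        width_mm: float, thickness_mm: float) -> float:
    """Maximum tensile stress at the bottom surface under the central roller.

    Uses 3*F*L / (2*b*h^2) for a rectangular cross-section.
    Newtons and millimetres give the result directly in MPa.
    """
    if width_mm <= 0 or thickness_mm <= 0:
        raise ValueError("cross-section dimensions must be positive")
    return 3.0 * force_n * span_mm / (2.0 * width_mm * thickness_mm ** 2)


if __name__ == "__main__":
    # Hypothetical 100 N load on a 20 mm span; 5 mm wide, 4 mm thick sample.
    print(flexural_stress_mpa(100.0, 20.0, 5.0, 4.0))  # 37.5
```

The flexural strength is then the value of this stress at the peak load recorded before failure.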

Scratch hardness test. The scratch resistance of wood samples was evaluated 
according to the Standard Test Method for Scratch Hardness, ASTM G171-03(2009),
using a linear reciprocating tribometer (Rtec Instruments Multi-Function
Tribometer). The test was performed by applying a normal load on a diamond
sphero-conical tip indenter and moving the wood surface laterally relative to the
indenter at a constant speed. The width of the scratch was then measured using
an optical microscope and the scratch hardness number (in gigapascals) was
calculated as kP/w², where P is the applied normal force, w is the scratch width
and k is the geometric constant. Each scratch hardness value was determined as an
arithmetic mean of a set of three scratches made side by side at different locations.
The lateral speed of the sample and the stroke length of the scratch were chosen as
0.2 mm s⁻¹ and 7 mm, respectively.
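
A minimal sketch of the scratch hardness calculation kP/w² follows. The paper does not state the value of k, so the default below, 8/π for force in newtons and width in metres, is an assumption based on the usual hemispherical-contact convention. The load and scratch widths in the example are hypothetical.

```python
import math
from statistics import mean

# Assumed geometric constant: newtons and metres in, pascals out.
K_SI = 8.0 / math.pi


def scratch_hardness_gpa(normal_force_n: float, scratch_width_m: float,
                         k: float = K_SI) -> float:
    """Scratch hardness number k*P/w^2, returned in GPa."""
    if scratch_width_m <= 0:
        raise ValueError("scratch width must be positive")
    return k * normal_force_n / scratch_width_m ** 2 / 1e9


def mean_scratch_hardness_gpa(normal_force_n, widths_m, k=K_SI):
    """Arithmetic mean over several scratches, as in the three-scratch protocol."""
    return mean(scratch_hardness_gpa(normal_force_n, w, k) for w in widths_m)


if __name__ == "__main__":
    # Hypothetical: 1 N load, three scratch widths of about 100 micrometres.
    widths = [98e-6, 100e-6, 102e-6]
    print(f"{mean_scratch_hardness_gpa(1.0, widths):.3f} GPa")
```

Because the width enters squared, a small error in the measured scratch width changes the hardness number by about twice that relative amount, which is one reason to average several scratches.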

Hardness modulus test. The hardness modulus was measured using a modified 
version of the standard procedure described in the ASTM D1037-121 with an Rtec 
Instruments Multi-Function Tribometer. The standard recommends indenting 
specimens of thickness 3-6 mm, using a ball of diameter 11.3 mm, to a depth of
2.5 mm. Since our test specimens are 5 mm thick, we used a smaller ball of diameter
4.76 mm with a penetration depth of 1.05 mm, which corresponds to an average
Hertzian contact pressure equal to that in the standard test. The rate of penetration
was constant at the recommended value of 1.3 mm min⁻¹. The penetration force
versus depth was plotted and the slope of the linear portion of this curve was
calculated as the hardness modulus (pounds per inch). Five indentations were made
on each specimen and the average value was reported. 
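
The slope extraction described above can be sketched as a least-squares fit restricted to a chosen depth window. How the linear portion was selected is not stated in the paper, so the window bounds here are an assumption supplied by the user, and the force-depth data are synthetic.

```python
# Unit conversion: 1 N/mm expressed in pounds-force per inch.
N_PER_MM_TO_LBF_PER_IN = 25.4 / 4.4482216152605


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def hardness_modulus(depth_mm, force_n, lo_mm, hi_mm):
    """Slope (N/mm) of force vs depth, fitted only for lo_mm <= depth <= hi_mm."""
    pts = [(d, f) for d, f in zip(depth_mm, force_n) if lo_mm <= d <= hi_mm]
    return linear_slope([d for d, _ in pts], [f for _, f in pts])


if __name__ == "__main__":
    # Synthetic curve: a soft initial toe, then a linear region of 50 N/mm.
    depth = [0.0, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0]
    force = [0.0, 2.0, 10.0, 20.0, 30.0, 40.0, 50.0]
    slope = hardness_modulus(depth, force, 0.2, 1.0)
    print(f"{slope:.1f} N/mm = {slope * N_PER_MM_TO_LBF_PER_IN:.1f} lbf/in")
```

Averaging the slope over the five indentations per specimen then gives the reported value.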

Charpy impact test. The Charpy impact test of the wood samples was performed 
on a Tinius Olsen pendulum impact tester. The dimensions of the samples were
14 mm × 4.5 mm × 100 mm.

Ballistic test. We performed the ballistic tests on wood samples using a gas gun, 
which comprises a pressure indicator frame, two cylinders filled with compressed 
nitrogen (N₂), a pressure chamber 127 mm in diameter and barrel-length
190.5 mm, a nozzle of length 1,156 mm and internal diameter 12.5 mm and a holder
specifically designed to clamp the sample. The pressure indicator frame has dials
with which we adjusted the pressure inside the two N₂ cylinders. The left cylinder
is used to pressurize the volume inside the barrel chamber when the projectile is
fired and the right N₂ cylinder controls the pressure for the firing valve, helping it
open instantaneously when fired. The chamber pressure was set to about 2.22 MPa.
Once opened, the valve releases the pressure and accelerates the projectile. We used 
projectiles cylindrical in shape, made of stainless steel, with a diameter of 11.85 mm,
length 51.77 mm and mass 0.046 kg. The dimensions of the samples were approximately
44 mm by 44 mm by 3 mm. The whole ballistic process was captured
using two Phantom v12 cameras. The Phantom Camera Control software
(http://www.phantomhighspeed.com/products/accessories-and-options/camera-control-software),
developed for such high-speed digital cameras, captured the velocities
of the projectile before and after perforating the sample. The ballistic energy
absorption of the test sample is defined by the kinetic energy loss after a cylindrical
steel projectile perforates the sample. The equation for the ballistic energy
absorption normalized by sample thickness is m(v₁² − v₂²)/(2t), where m is the mass
of the projectile, t is the thickness of the sample and v₁ and v₂ are the velocities of
the projectile before and after perforating the sample, respectively.
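
The thickness-normalized energy absorption defined above is straightforward to compute. In the sketch below the projectile mass and sample thickness are taken from this section, while the entry and exit speeds are hypothetical values chosen only to illustrate the arithmetic.

```python
def ballistic_energy_absorption(mass_kg: float, v_in: float, v_out: float,
                                thickness_m: float) -> float:
    """Kinetic energy lost per unit sample thickness, in J/m.

    Implements m * (v_in^2 - v_out^2) / (2 * t), with speeds in m/s.
    """
    if thickness_m <= 0:
        raise ValueError("thickness must be positive")
    if v_out > v_in:
        raise ValueError("exit speed cannot exceed entry speed")
    return mass_kg * (v_in ** 2 - v_out ** 2) / (2.0 * thickness_m)


if __name__ == "__main__":
    mass = 0.046        # kg, projectile mass from the Methods
    thickness = 0.003   # m, approximate sample thickness from the Methods
    # Hypothetical speeds before and after perforation, in m/s.
    absorbed = ballistic_energy_absorption(mass, 30.0, 18.0, thickness)
    print(f"{absorbed / 1000:.2f} kJ/m")  # 4.42 kJ/m
```

If the projectile is trapped in the sample, as reported for the laminate, the exit speed is zero and all of the incident kinetic energy counts as absorbed.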

Accelerated test against moisture. The humidity chamber (LHS-150HC-II) was 
set up at 20°C, 95% RH. Then pre-cut samples with dimensions of approximately 
15 mm by 15 mm by 4.6 mm were placed in the humidity chamber. The dimensions
and weight of the samples after various intervals in the humidity chamber
were recorded. Following a painting method widely used in the wood industry, 
we coated the densified wood with a thin layer of oil-based paint (Polyurethane, 
Minwax). After the paint was totally dry, the sample was put into the humidity 
chamber and measured at regular intervals. 

Characterizations. A scanning electron microscope (SEM, Hitachi SU-70) was 
used to characterize the morphologies of the wood samples. Small-angle X-ray 
scattering (SAXS) patterns for three samples of each wood were collected using 
Rigaku MicroMax 007HF (operating voltage at 40 kV, current at 30 mA, Cu Kα,
λ = 0.1541 nm). The angle between the incident X-ray beam and the width direction
on the sample was kept at 90°. The raw azimuthal intensity distribution was
extracted and the baseline was subtracted. Wide-angle X-ray diffraction patterns
were collected on multi-filament bundles using a Rigaku RAPID II (operating
voltage at 40 kV, current at 30 mA, Cu Kα, λ = 0.1541 nm) equipped with a curved
detector manufactured by Rigaku Americas Corporation. Compositional analysis 
of natural wood and chemical-treated wood was carried out on a high-performance 
liquid chromatography (HPLC) system (Ultimate 3000, Thermo Scientific, USA). 
Mechanics modelling. We used a generic coarse-grained simulation scheme to 
qualitatively reveal the underlying mechanism for the enhancement in mechanical 
properties. The wood fibre is modelled as a tube made of coarse-grained beads that 
assume a hexagonal lattice structure (Extended Data Fig. 8d). The bonded energy 
terms of the coarse-grained scheme consist of a two-body bond energy and three- 
body angle energy and a four-body torsion energy as follows: 


$$
U_{\mathrm{bonded}}(r_{ij}, \theta_{ijk}) = \sum \frac{1}{2} K_{\mathrm{bond}} (r_{ij} - r_0)^2 + \sum \frac{1}{2} K_{\theta} (\cos\theta_{ijk} - \cos\theta_0)^2 + \sum_{n=1}^{5} A_n \cos^{n-1}\phi
$$

where $K_{\mathrm{bond}}$ and $K_{\theta}$ are the bond force constant and the angle force constant,
respectively; $r_{ij}$ is the distance between the $i$th and $j$th coarse-grained beads and
$r_0$ is the corresponding equilibrium value of $r_{ij}$; $\theta_{ijk}$ is the angle formed between the
$i$-$j$ bond and the $j$-$k$ bond and $\theta_0$ is the corresponding equilibrium value of $\theta_{ijk}$
(which is 120° for all cases). $A_n$ are coefficients ($n = 1, 2, 3, 4, 5$) for the dihedral
angle $\phi$. The non-bonded term includes the long-range van der Waals Lennard-Jones-type
interaction $4\varepsilon[(\sigma/r)^{12} - (\sigma/r)^{6}]$ between coarse-grained beads (cut-off
distance 1 nm; $\varepsilon$ denotes the interaction strength and $\sigma$ denotes the distance at which
the interaction energy crosses zero) and a short-range (cut-off distance 0.24 nm)
Morse-type potential $D_0[e^{-2\alpha(r - r_{\mathrm{Morse},0})} - 2e^{-\alpha(r - r_{\mathrm{Morse},0})}]$, which is used to model
the hydrogen-bond interaction among wood fibres. The simulations were run at 300 K
in the canonical ensemble with a Nosé-Hoover thermostat. Extended Data Fig. 8g
lists the values of the coarse-grained parameters used in the simulations.
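
To make the two non-bonded terms concrete, the sketch below evaluates a Lennard-Jones and a Morse pair energy with the cut-off distances stated above. It is an illustration rather than the authors' simulation code: the plain truncation at the cut-off is an assumption, and the parameter values are placeholders, not the values listed in Extended Data Fig. 8g.

```python
import math

LJ_CUTOFF_NM = 1.0       # cut-off stated in the Methods
MORSE_CUTOFF_NM = 0.24   # cut-off stated in the Methods


def lennard_jones(r, epsilon, sigma, cutoff=LJ_CUTOFF_NM):
    """4*eps*[(sigma/r)^12 - (sigma/r)^6], set to zero beyond the cut-off."""
    if r >= cutoff:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def morse(r, d0, alpha, r0, cutoff=MORSE_CUTOFF_NM):
    """D0*[exp(-2a(r-r0)) - 2*exp(-a(r-r0))], set to zero beyond the cut-off."""
    if r >= cutoff:
        return 0.0
    x = math.exp(-alpha * (r - r0))
    return d0 * (x * x - 2.0 * x)


if __name__ == "__main__":
    # Placeholder parameters, NOT the values from Extended Data Fig. 8g.
    eps, sigma = 1.0, 0.4            # energy units, nm
    d0, alpha, r0 = 5.0, 20.0, 0.2   # energy units, nm^-1, nm
    print(lennard_jones(sigma, eps, sigma))                  # zero crossing at r = sigma
    print(lennard_jones(2 ** (1 / 6) * sigma, eps, sigma))   # minimum, about -eps
    print(morse(r0, d0, alpha, r0))                          # minimum, equal to -d0
```

Setting the Morse well depth to zero corresponds to switching hydrogen bonding off, which is the comparison made in Extended Data Fig. 8e.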

Data availability. The data that support the findings of this study are available 
from the corresponding authors on reasonable request. 
Extended Data Figure 1 | Structural characterization of natural wood
and densified wood. a, b, Comparison of SEM images of natural wood (a)
and the wood after partial lignin removal but without lateral hot-pressing (b)
shows that lignin between the cell walls is partially removed.
c, Comparison of densities of natural (0.43 ± 0.02 g cm⁻³) and densified
woods (1.30 ± 0.02 g cm⁻³). d-f, SEM images of the cross-section of
natural wood in the RT (d) and TL (e, f) planes show intrinsic defects such
as vessels and tracheids along the L direction and pits in the cell walls.
g-j, The corresponding SEM images of densified wood show that the
hollow lumina are completely collapsed to form highly intertwined wood
cell walls (g), as verified by the simulation model (j), and even the tiny pits
in the wood cell walls are eliminated owing to the densification (h, i).
k-m, The small-angle X-ray scattering pattern (k) and the high-magnification
SEM image (l) show well-aligned cellulose nanofibres
in densified wood, which greatly facilitate the formation of hydrogen
bonds in neighbouring cellulose molecular chains during their relative
sliding (m).
h
Sample     Cellulose    Hemicellulose  Lignin       Strength  Work of fracture  Density
name       content (%)  content (%)    content (%)  (MPa)     (MJ m⁻³)          (g cm⁻³)
NW         44.01        19.5           20.8         51.6      0.43              0.46
DW-0%      44.01        19.5           20.8         175.0     1.1               1.04
DW-23.6%   42.2         10.6           15.9         325.6     1.6               1.13
DW-27%     40.2         9.2            15.1         386.3     [illegible]       1.20
DW-30%     38.2         7.2            14.7         425.6     2.3               1.23
DW-32.5%   38.1         6.8            14.3         488.8     2.3               1.25
DW-45%     38.7         5.2            11.3         586.8     4.0               1.3
DW-60%     35.4         3.8            8.2          319.0     1.48              1.15
DW-100%    31.2         1.89           0.13         12.5      0.02              1.06


Extended Data Figure 2 | Effect of degree of lignin removal on wood 
structure and mechanical properties. a, Schematics of wood sample with 
the L direction as the tree-growth direction. b, c, SEM images of the cross- 
sections in the RT plane (b) and the RL plane (c) of a pressed wood sample 
with 0% lignin removal, which show a large number of gaps remaining in 
between partially collapsed cell walls. d, e, Photo and SEM image of the 
densified wood with 100% lignin removal show that the pressed cell walls 
are separated from each other owing to the absence of lignin as binding
agent. f, g, Densities (f) and tensile stress-strain curves (g) of densified
woods with various degrees of lignin removal. h, Summary of cellulose/ 
hemicellulose/lignin contents as well as strength, work of fracture and 
density under various degrees of lignin removal. Densified wood with 45% 
lignin removal is shown to have the highest strength, work of fracture and 
density. DW-x refers to densified wood with a certain amount (x) of lignin 
removal and subsequent densification, whereas NW refers to natural wood 
without lignin removal or densification. 
Extended Data Figure 3 | Comparison of mechanical properties
of natural wood and densified wood. a, Stiffness (natural wood,
4.8 ± 0.9 GPa; densified wood, 51.6 ± 1.5 GPa). b, Scratch hardness
(natural wood, 0.02 ± 0.0029 GPa; densified wood, 0.6 ± 0.025 GPa).
c, Interferometer images of scratches on natural wood and densified wood,
showing the notable decrease of the scratch depth of the densified wood
owing to increased hardness. d, Charpy impact toughness (densified
wood, 11.41 ± 0.5 J cm⁻²; natural wood, 1.38 ± 0.3 J cm⁻²). e, Hardness
modulus (natural wood, 740.1 ± 115.4 pounds per inch; densified wood,
9454.5 ± 273.3 pounds per inch). f, i, l, Schematics of bending tests
along three different directions. g, j, m, Corresponding flexural stress
as a function of roller displacement (bending deflection) for natural
wood and densified wood. h, k, n, Comparison of the corresponding
flexural strengths of natural wood (with the roller along the T direction,
54.3 ± 5.1 MPa; perpendicular to wood growth direction, 4.4 ± 0.9 MPa;
with the roller along the R direction, 42.6 ± 4.9 MPa; eight samples
tested for each direction) and densified wood (with the roller along the
T direction, 336.8 ± 11.3 MPa; perpendicular to wood growth direction,
79.5 ± 3.0 MPa; with the roller along the R direction, 315.3 ± 14.8 MPa;
eight samples tested for each direction).
Extended Data Figure 4 | Compressive strength of natural wood
and densified wood. a, d, g, Schematics of compression tests along
three different directions. b, e, h, Corresponding compressive stress
as a function of compressive displacement for natural wood and
densified wood. c, f, i, Comparison of the corresponding compressive
strengths of natural wood (L direction, 29.6 ± 2.0 MPa; R direction,
3.9 ± 0.6 MPa; T direction, 2.6 ± 0.4 MPa; eight samples tested for each
direction) and densified wood (L direction, 163.6 ± 4.1 MPa; R direction,
203.8 ± 5.2 MPa; T direction, 87.6 ± 3.0 MPa; eight samples tested for
each direction). j, Comparison of axial compressive strengths (along
the L direction) of natural wood, delignified wood without hot-pressing,
pressed natural wood without delignification, and densified wood
(delignified and then hot-pressed). Insets illustrate the representative
cross-section features of the four types of wood.
Extended Data Figure 5 | Dimensional stability and mechanical
properties of pressed natural wood, densified wood and surface-painted
densified wood against moisture. a, b, Photographs of pressed
natural wood without delignification, densified wood (45% lignin removal
and then hot-pressed) and surface-painted densified wood before (a) and
after (b) sustaining 95% RH for 128 h. c, Change in thickness of the three
wood samples over time. d, Percentage increase in thickness (pressed
natural wood, 43.1% ± 1.4%; densified wood, 8.4% ± 0.9%; surface-painted
densified wood, 0%). e, Tensile stress-strain curves of the three wood
samples after sustaining 95% RH for 128 h. f, Strengths of the three wood
samples before (pressed natural wood, 161.5 ± 18.8 MPa; densified wood,
548.8 ± 47.2 MPa; surface-painted densified wood, 541.7 ± 29.2 MPa)
and after (pressed natural wood, 98.2 ± 12.6 MPa; densified wood,
493.1 ± 20.3 MPa; surface-painted densified wood, 535.9 ± 30.0 MPa)
sustaining 95% RH for 128 h.


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


Extended Data Figure 6 | Superb mechanical properties of various species
of densified wood. Comparison of the stress-strain curve, tensile strength
and work of fracture for natural and densified woods of various species:
the hardwoods oak and poplar, and the softwoods cedar and pine.
a-c, Oak (natural wood strength, 115.3 ± 10.2 MPa; densified wood strength,
584.3 ± 29.8 MPa; natural wood work of fracture, 1.84 ± 0.1 MJ m⁻³;
densified wood work of fracture, 5.3 ± 0.2 MJ m⁻³). d-f, Poplar (natural
wood strength, 55.6 ± 8.0 MPa; densified wood strength, 431.5 ± 15.1 MPa;
natural wood work of fracture, 0.48 ± 0.05 MJ m⁻³; densified wood
work of fracture, 3.0 ± 0.1 MJ m⁻³). g-i, Cedar (natural wood strength,
46.5 ± 5.4 MPa; densified wood strength, 550.1 ± 47.4 MPa; natural wood
work of fracture, 0.35 ± 0.06 MJ m⁻³; densified wood work of fracture,
3.3 ± 0.08 MJ m⁻³). j-l, Pine (natural wood strength, 70.2 ± 10.0 MPa;
densified wood strength, 536.9 ± 24.7 MPa; natural wood work of fracture,
0.58 ± 0.07 MJ m⁻³; densified wood work of fracture, 3.03 ± 0.33 MJ m⁻³).
m, Comparison of the structural and mechanical properties of the densified
wood in this study and other previously reported densified wood
materials made from different species of natural wood.
Extended Data Figure 7 | Fracture surface (RL plane) of natural wood
and densified wood. a, c, The schematics of the natural wood and
densified wood. b, SEM image of the fracture surface of the natural wood
showing the pulling out and tearing of the hollow wood lumina along the
fracture surface in the RL plane. d, SEM image of the fracture surface of
the densified wood in RL plane showing the pulling out and fracture of
wood fibres from the densely packed cell walls.
Extended Data Figure 8 | Simulation model for natural wood and
densified wood. a, To obtain the compressed morphology of natural
hollow wood fibres in simulations, we apply the reflective wall boundary
condition and then gradually shrink one dimension of the simulation box
so that the bundle is compressed laterally. b, c, Morphological view of
uncollapsed (b) and collapsed (c) wood-fibre bundles during pulling.
d-f, Effect of hydrogen bonding (HB). d, Simulation model to demonstrate
the effect of hydrogen bonding. Two wood fibres slide along each other.
e, The corresponding resistant forces with hydrogen bonding turned on
and turned off (that is, voiding the Morse potential in the simulation force
field) are calculated as a function of sliding displacement, respectively,
showing that the hydrogen bonding would increase the resistant force
by about ten times. f, The initial configuration of the seven-lumina
bundle model used in the main text. These lumina each have a diameter
of 6.26 nm and length of 8.95 nm. g, The values of the coarse-grained
parameters used in the simulations.
Extended Data Figure 9 | Comparison of tensile properties of X-Y
stacking densified wood and monolayer densified wood. a-c, Tensile
properties of the natural wood and monolayer densified wood along the
T direction: a, illustration of tensile direction, b, tensile stress-strain
curves and c, tensile strengths along the T direction (natural wood,
5.1 ± 0.4 MPa; densified wood, 43.3 ± 2.0 MPa). d-f, Tensile properties
of the X-Y stacking densified wood: d, illustration of the X-Y stacking
densified wood and two perpendicular tensile directions, e, tensile
stress-strain curves and f, the tensile strengths of the X-Y stacking
densified wood along directions 1 and 2 are nearly the same
(221.6 ± 20.0 MPa and 225.6 ± 18.0 MPa, respectively), much higher
than that of natural wood and that of monolayer densified wood in the
T direction.
Extended Data Figure 10 | Ballistic test. a, Schematics of the air-gun
ballistic tester. b, Photograph of natural wood after ballistic test, showing
relatively smooth wood surface after the projectile perforates the wood.
c, d, SEM images of the fracture surface show that fracture takes place
along the loosely bonded cell walls in natural wood. e, Photograph of
monolayer densified wood after ballistic test, showing severely chapped
wood surface after the projectile perforates the wood. f, g, SEM images
of the fracture surface show enormous numbers of wood fibres pulled
out from the densely packed cell walls, suggesting substantial energy
dissipation during the projectile perforating the densified wood.
h, Ballistic energy absorption of the monolayer densified wood
(Y, 2.5 ± 0.1 kJ m⁻¹; X, 4.3 ± 0.08 kJ m⁻¹) and laminated densified
wood (X-Y-X-Y-X laminate: 5.6 ± 0.2 kJ m⁻¹; X-Y-X-Y-X laminate:
6.0 ± 0.1 kJ m⁻¹) from both directions (X, fibre alignment direction;
Y, perpendicular to fibre alignment direction). The insets show the
schematics of the sample and holder. i-l, Simulation model used in
Fig. 4c, d. i, j, End view and top view of the parallel wood fibre model,
respectively. k, l, End view and top view of the sandwiched wood fibre
model, respectively. These wood fibres (before being collapsed) have a
diameter of 2.35 nm and a length of 15.34 nm.
Limited emission reductions from fuel subsidy
removal except in energy-exporting regions


Jessica Jewell, David McCollum, Johannes Emmerling, Christoph Bertram, David E. H. J. Gernaat, Volker Krey,
Leonidas Paroussos, Loic Berger, Kostas Fragkiadakis, Ilkka Keppo, Nawfal Saadi, Massimo Tavoni,
Detlef van Vuuren, Vadim Vinichenko & Keywan Riahi


Hopes are high that removing fossil fuel subsidies could help 
to mitigate climate change by discouraging inefficient energy 
consumption and levelling the playing field for renewable energy.
In September 2016, the G20 countries re-affirmed their 2009
commitment (at the G20 Leaders’ Summit) to phase out fossil fuel
subsidies and many national governments are using today’s low
oil prices as an opportunity to do so. In practical terms, this
means abandoning policies that decrease the price of fossil fuels
and electricity generated from fossil fuels to below normal market
prices. However, whether the removal of subsidies, even if
implemented worldwide, would have a large impact on climate 
change mitigation has not been systematically explored. Here we 
show that removing fossil fuel subsidies would have an unexpectedly 
small impact on global energy demand and carbon dioxide emissions 
and would not increase renewable energy use by 2030. Subsidy 
removal would reduce the carbon price necessary to stabilize 
greenhouse gas concentration at 550 parts per million by only 2-12 
per cent under low oil prices. Removing subsidies in most regions 
would deliver smaller emission reductions than the Paris Agreement 
(2015) climate pledges and in some regions global subsidy removal 
may actually lead to an increase in emissions, owing to either coal 
replacing subsidized oil and natural gas or natural-gas use shifting 
from subsidizing, energy-exporting regions to non-subsidizing, 
importing regions. Our results show that subsidy removal would 
result in the largest CO2 emission reductions in high-income
oil- and gas-exporting regions, where the reductions would exceed 
the climate pledges of these regions and where subsidy removal 
would affect fewer people living below the poverty line than in lower- 
income regions. 

Fossil fuel subsidies amounted to about $330 billion (in 2005 US dollars,
here and throughout) worldwide in 2015 after having
reached about $570 billion in 2013. This fall in subsidies could be partly 
a sign of reform or simply a reflection of today’s lower oil prices, given 
that historically subsidies have followed the oil price!’ (Supplementary 
Fig. 1). It is therefore too early to say whether subsidies will continue to 
fall, stabilize or increase if oil prices rise again. Earlier work found that 
global subsidy removal by 2020 would reduce greenhouse gas emissions 
by 5% (ref. 12) to 6% (ref. 13) by 2035 and by 6% (ref. 12) to 8% 
(refs 14, 15) by 2050. However, all of these studies were done using a 
single model and none of them explored variations in the oil price, 
which greatly affects the size of subsidies. 

We used five Integrated Assessment Models (IAMs) to evaluate 
the global and regional effects of removing fossil fuel subsidies on
emissions, the energy mix and energy demand under both low and high
oil prices. In the high-oil-price scenarios, oil prices exceed $100 per 
barrel and in the low-oil-price scenarios they drop below $60 per barrel by 
2020 (Fig. 1). 

The IAMs we use vary in their modelling approaches and solution 
mechanisms (Supplementary Table 1, Supplementary Information 
sections 1, 2), which improves the robustness of the results in the face of 
structural model uncertainties. They include four technology-detailed 
energy-economy models and one multi-sectoral computable general 
equilibrium model. An important difference across models, which 
affects the modelled effects of subsidy removal, is the responsiveness of 
energy supply and demand to changes in energy prices (Supplementary 
Tables 1, 2, Supplementary Information sections 2, 3). 
Figure 1 | Modelled high- and low-oil-price scenarios. Historical prices
represent crude oil prices from ref. 28 and are shown through to the end of 
2015. Modelled prices start in 2020. 
230 | NATURE | VOL 554 | 8 FEBRUARY 2018

Figure 2 | Current and projected fossil fuel subsidies without reform.
a, Global subsidies in 2013 (high oil prices), in 2015 (low oil prices),
and in 2030 under high and low oil prices projected in different models.
b, The regional distribution of subsidies in 2015 (see also Supplementary
Table 5). c, Subsidies in 2013 and 2015 (Supplementary Table 5) and in
2030 under high and low oil prices in each region (model median).
For model ranges and additional years see Supplementary Tables 5, 7
and 8. The map presents a stylistic representation of regions.
For regional definitions see Supplementary Tables 9-14.

We follow the International Energy Agency (IEA) and the
Organisation for Economic Co-operation and Development
(OECD) definition of fossil fuel subsidies as government support
of the consumption or production of oil, gas or coal that lowers
their prices below normal market prices (Methods). This definition
excludes un-priced environmental and social externalities such as
air pollution and related health effects, which are included in some
other estimations (ref. 16) but are not appropriate for the purpose of this
paper (Methods). We compiled a global comprehensive dataset of
fossil fuel subsidies (refs 6, 10, 11, 17, 18) under both high and low oil prices
(Supplementary Tables 3, 4, Supplementary Information sections 4, 5).
In 2013, when oil prices were relatively high, subsidies amounted
to approximately $570 billion (Supplementary Table 5), including
$340 billion for oil, $110 billion each for natural gas and electricity,
and $5 billion for coal (Fig. 2). Only $22 billion (less than 4%)
were production subsidies (Supplementary Table 3). Following the
decline in oil prices, subsidies fell to about $330 billion in 2015,
which amounted to about 10% of energy-related market transactions
(Supplementary Table 6).
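
The subsidy figures quoted for 2013 and 2015 can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded values stated in the text (billions of 2005 US dollars); the variable names are illustrative, and the implied size of energy-related market transactions is a back-of-envelope inference from the "about 10%" statement, not a number reported by the authors.

```python
# Rounded values quoted in the text, in billions of 2005 US dollars.
SUBSIDIES_2013 = {"oil": 340, "natural gas": 110, "electricity": 110, "coal": 5}
TOTAL_2013 = 570       # "approximately $570 billion"
PRODUCTION_2013 = 22   # production subsidies in 2013
TOTAL_2015 = 330       # "about $330 billion"


def share(part, whole):
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


# The fuel-by-fuel breakdown should add up to roughly the quoted total.
component_sum = sum(SUBSIDIES_2013.values())

# Production subsidies as a share of the 2013 total ("less than 4%").
production_share = share(PRODUCTION_2013, TOTAL_2013)

# Relative fall in subsidies between 2013 and 2015.
fall_2013_to_2015 = share(TOTAL_2013 - TOTAL_2015, TOTAL_2013)

# Assumption: if $330 billion is about 10% of energy-related market
# transactions, those transactions are about ten times larger.
implied_market_2015 = TOTAL_2015 / 0.10

print(f"sum of components (2013): {component_sum} billion")
print(f"production share (2013):  {production_share:.1f}%")
print(f"fall 2013 to 2015:        {fall_2013_to_2015:.0f}%")
print(f"implied market (2015):    {implied_market_2015:.0f} billion")
```

The components sum to slightly less than the quoted total because every figure in the text is rounded.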


In our scenarios, we model subsidy rates in a way consistent with 
historical patterns (Methods). Under high oil prices, by 2030, global 
subsidies would grow to between $750 and $970 billion; under low oil 
prices, subsidies would be between $550 and $700 billion through to the 
end of 2030 (Supplementary Table 5). In the subsidy removal scenarios, 
the phase-out of subsidies starts in 2020 and is completed by 2030.

The three oil- and gas-exporting regions, the Middle East and
North Africa (MENA), Russia+ (the ‘+’ superscript is used to refer
to regions that constitute more than only the named country;
see Supplementary Table 9 for region definitions) and Latin
America, accounted for about two-thirds of all fossil fuel subsidies
worldwide in 2015 (Fig. 2). In Latin America and MENA,
about half of total subsidies go to oil. In Russia+, about half of
total subsidies go to natural gas and the remainder to electricity
(mostly generated from natural gas). Of these three regions, subsidy
expenditures would grow the most in MENA, which would experience
the largest growth in energy use (Fig. 2).

Figure 3 | Global and regional impact of subsidy removal and NDCs
on CO2 emissions from fossil fuels and industry under low oil prices.
a, The impact of subsidy removal on global annual emissions compared
to each model’s baseline. b, The impact of subsidy removal on cumulative
change in emissions from 2020 to 2030 at the regional level (coloured
bars). Solid lines represent emission effects of unconditional NDCs and
dashed lines of conditional NDCs, both modelled in MESSAGE (ref. 29).
The uncertainty ranges for these effects arise from different historical
emission inventories, alternative accounting, attribution of non-
commercial biomass and uncertainties in the formulations of NDCs
(Supplementary Methods, Supplementary Table 15; ref. 29). See
Supplementary Fig. 6 for high-oil-price scenarios and Supplementary
Fig. 5 for global relative changes and regional absolute changes.

Developing and emerging economies (India+, Rest of Asia, Africa
and China+) currently have lower subsidies than the oil and gas
exporters, but their subsidies may grow faster in the future (Fig. 2).
Without reform, subsidies in India under high oil prices could become
comparable to those in Latin America and Russia+ by 2030. In these
regions, over half of all subsidies go to oil, for example, through
depressed road fuel prices (in countries in the Rest of Asia region),
tax breaks on road fuels (in China), or kerosene subsidies (in India
and Africa).

Subsidies in the developed regions (Europe, North America and the 
Pacific OECD) accounted for about 13% of subsidies worldwide in 
2015. These are not projected to grow very much in the future. 

Figure 4 | Change in supply of different fuels resulting from subsidy
removal in 2030 in four regions under low oil prices. MENA and Russia+
illustrate exporting regions, India+ illustrates developing importing
regions, Europe illustrates developed regions (Supplementary Fig. 10
shows the other six regions). Positive values of ‘Net change’ indicate a
decrease in the total primary energy supply; negative values indicate an
increase. Supplementary Figs 11 and 12 show results under high oil prices.
The regional definitions (Supplementary Tables 9-14) can influence the
size of energy system changes.

Subsidy removal would lead to a small decrease in global CO2
emissions: 0.5-2 gigatons of carbon dioxide (Gt CO2) or 1%-4%
by 2030 under both low (Fig. 3, Supplementary Fig. 5) and high
(Supplementary Figs 5, 6) oil prices. This is much less than the
Nationally Determined Contributions (NDCs) from the Paris
Agreement, which add up to a decrease of 4-8 Gt from fossil
fuels and industry. Subsidy removal would reduce the average global
carbon price in 2020-2050 that would be required to achieve modest
climate goals (an atmospheric concentration target of 550 parts per
million CO2 equivalent by 2100 or a probable 2-2.3 °C temperature
increase in 2100; ref. 19) by an average of 2%-12% or by $0.7-$2.1 per ton
of CO2 under low oil prices (Supplementary Information section 6,
Supplementary Tables 16, 17).

Even though the oil price has an impact on the absolute level of 
subsidies, it does not greatly affect the impact of subsidy removal on 
emissions because the latter depends on the ratio between subsidies and 
energy prices, which is similar in the low- and high-oil-price scenarios. 
Figures 3 and 4 illustrate the low-oil-price scenarios; the high-oil-price 
scenarios are illustrated in Supplementary Information and described 
in the text wherever they are very different. 
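
The role of the subsidy-to-price ratio can be illustrated with a toy calculation. The sketch below is not one of the five IAMs used in the study. It assumes a single fuel, a constant price elasticity of demand (the value -0.3 is hypothetical) and a subsidy expressed as a fixed fraction of the unsubsidized market price (the value 0.2 is also hypothetical). Under those assumptions the relative change in demand after subsidy removal depends only on the ratio, not on the absolute oil price.

```python
import math


def demand_change_after_removal(market_price, subsidy_rate, elasticity):
    """Relative change in demand when a subsidy is removed.

    Toy constant-elasticity model: demand is proportional to
    price ** elasticity. `subsidy_rate` is the subsidy as a fraction of
    the unsubsidized market price (a hypothetical parameter).
    """
    subsidized_price = market_price * (1.0 - subsidy_rate)
    price_ratio = market_price / subsidized_price  # equals 1 / (1 - subsidy_rate)
    return price_ratio ** elasticity - 1.0


ELASTICITY = -0.3    # hypothetical price elasticity of demand
SUBSIDY_RATE = 0.2   # hypothetical subsidy-to-price ratio

# Same ratio, very different absolute subsidies per barrel:
# $10 at a $50 oil price, $22 at a $110 oil price.
low = demand_change_after_removal(50.0, SUBSIDY_RATE, ELASTICITY)
high = demand_change_after_removal(110.0, SUBSIDY_RATE, ELASTICITY)

print(f"low oil price:  {low:+.1%}")
print(f"high oil price: {high:+.1%}")
```

Both scenarios give the same relative reduction in demand, even though the absolute subsidy per barrel is more than twice as large in the high-price case.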

The impacts of subsidy removal are very different in two groups of 
regions. In oil- and gas-exporting regions (MENA, Russia+ and Latin
America), subsidy removal leads to the largest emission reductions, 
equivalent to or greater than their relatively modest NDCs. In all other 
regions, emission reductions from subsidy removal are generally less 
than their NDCs (Fig. 3, Supplementary Fig. 6). 

In Russia+, where most subsidies are for natural gas (including
electricity generation), subsidy removal would reduce the use of
natural gas and generally lead to higher emission reductions than the
modest NDCs. In MENA and Latin America, subsidy removal would
decrease the use of oil and natural gas, leading to emission reductions
that are generally comparable to the so-called ‘conditional’ NDCs
(that is, commitments dependent on international action) but generally
larger than the unconditional NDCs.

Developing and emerging economies that are not major oil and
gas exporters would generally experience smaller emission impacts
(both in absolute terms and in relation to their NDCs) from subsidy
removal owing to their lower subsidy levels. The main effect of subsidy
removal in India+ would be reduced use of oil and natural gas, and in
the Rest of Asia would be slightly reduced use of coal and oil. In both
regions, the decline in emissions would be generally smaller than the
NDCs. In China+ subsidies are lower and the impact of their removal
would also be small in comparison with the NDCs. In Africa, subsidy
removal would also have a much smaller effect than the NDCs (and in
one model would even lead to an increase in emissions owing to the
replacement of oil by coal).

In the three developed regions (Europe, North America and the 
Pacific OECD) with low subsidies, the main impact of global subsidy 
removal is driven by the change of the price of fossil fuels on the global 
market. As oil and gas exporters reduce domestic demand by removing 
subsidies, they make more resources available for the global market. 
This can, for example, lead to increased use of natural gas in Europe 
(Fig. 4). This effect is more pronounced in models with more flexible 
energy trade. The resulting change in emissions can either be negative
or positive depending on whether the cheaper natural gas substitutes
for oil and coal or leads to an increase in consumption. All in all,
subsidy removal would lead to much smaller emission reductions than 
the NDCs. 

Although the above results are robust across models, there are certain
variations owing to the different features and assumptions of particular
models. The most notable difference is that in some regions, subsidy
removal can unexpectedly lead to an increase in emissions. In India+
(the MESSAGE model) and Africa (the REMIND model) this occurs 
because these models assume more flexibility in fuel substitution. 
As a result, removing subsidies leads to substitution of oil or natural 
gas with more carbon-intensive coal, producing either an increase in 
emissions or smaller reductions of emissions. In addition, REMIND 
assumes the most flexible international energy trade, which means 
that energy-importing regions (Europe, the Pacific OECD and North 
America) increase use of natural gas (and therefore greenhouse gas 
emissions; Fig. 3) after it stops being subsidized in energy-exporting 
regions. Other less notable differences are discussed in Supplementary 
Information section 2. 

Our results show that removing fossil fuel subsidies would lower 
global energy demand. The decrease in energy demand is caused by 
increasing energy prices and ranges between 5 EJ and 26 EJ per year or 
1%-4% in 2030 (Supplementary Figs 7, 8). Under high oil prices, the 
decrease in demand is larger, reaching up to 30 EJ per year or 7% in 
2030 (Supplementary Figs 7, 8). The decrease in demand is largest in 
oil- and gas-exporting regions (MENA, Russia+ and Latin America),
whereas in some energy-importing regions energy use could even 
increase following subsidy removal owing to the larger availability of 
natural gas on international markets (as discussed above). 

In addition, removing fossil fuel subsidies would not strongly
stimulate the growth of renewable energy by 2030 (Fig. 4). In general,
removing fossil fuel subsidies leads to an increase in the share of
renewables in regional energy mixes of less than two percentage points
(Supplementary Fig. 13). A slightly larger increase may occur under
high oil prices in bioenergy in Russia+, MENA and Latin America
or solar energy in MENA and Russia+ (Supplementary Figs 10-12).
Beyond 2030, subsidy removal could stimulate more noticeable growth 
of renewable energy, in particular bioenergy under certain modelling 
assumptions. 

A more pronounced effect of fossil subsidy removal is the switch
from one fossil fuel to another, for example from subsidized natural
gas and oil to coal in MENA, Russia+ and India+ as well as from coal
and oil to natural gas in Europe (Fig. 4), which highlights the need to
consider the systemic effects of subsidy reform policies. The switch
between fossil fuels is more pronounced in models with higher
flexibility of supply and lower flexibility of demand as well as higher
flexibility of international trade (Supplementary Information section 2).
Another, more granular effect is the slowdown of the switch from solid
fuels (such as coal and firewood) to natural gas and kerosene among
the poor, as shown by IMAGE (a model representing different income
groups; see Supplementary Fig. 9). This is in line with earlier findings
that as modern fuels become more expensive, lower-income groups are
unable to avoid traditional fuels, unless supportive policies are
implemented in parallel (refs 20, 21).

We tested the sensitivity of our findings against baseline assumptions 
(Supplementary Information section 7, Supplementary Figs 14-17), 
decoupling of the oil and gas prices (Supplementary Information 
section 8, Supplementary Figs 18-21), and the assumption of higher
production subsidies (refs 22, 23; Supplementary Information section 9,
Supplementary Table 18, Supplementary Figs 22-25). The emissions
and energy systems impacts are generally robust across these
uncertainties but changing socio-economic baseline assumptions changes
the projected emission reductions from some regional NDCs, which 
in turn changes the relationship between the NDCs and the effects of 
subsidy removal (Supplementary Information section 7). 

Our finding that subsidy removal would have the largest impact 
on CO2 emissions in Russia+, MENA and Latin America is especially
meaningful when we consider two features of the political economy 
of subsidies. The first is that subsidy removal could disproportionately
harm the poor in some countries (refs 24, 25). The second is that today’s
low oil prices pressure energy-exporting states to reduce spending as 
government revenues shrink™*. This provides a unique political oppor- 
tunity to remove subsidies precisely where it would have the largest 
effect on emissions and affect a comparatively small number of people 
living below $3.10 per day (Supplementary Table 19, Supplementary 
Information section 10). Conversely, in low-income regions, subsidy 
removal would lead to smaller emission reductions and probably 
affect more people living below the poverty line. The frequently voiced 
suggestion of coupling subsidy removal with other emission-reduction 
policies such as carbon pricing!*"* or clean energy support schemes”®”” 
would not necessarily reduce the impact of subsidy removal on the poor 
unless such policies are specifically designed to do so. 


Data Availability All data for the subsidy scenarios and sensitivities are available 
at https://tntcat.iiasa.ac.at/ADVANCEWP3DB. The NDC data used in this paper 
are from ref. 29 and are available on request. The sources and compilation 
method for the input data on subsidies and prices are described in detail in 
Supplementary Methods subsection ‘Energy price and subsidy data’. 
Evolutionary history of the angiosperm flora of China


Li-Min Lu, Ling-Feng Mao, Tuo Yang, Jian-Fei Ye, Bing Liu, Hong-Lei Li, Miao Sun, Joseph T. Miller,
Sarah Mathews, Hai-Hua Hu, Yan-Ting Niu, Dan-Xiao Peng, You-Hua Chen, Stephen A. Smith, Min Chen,
Kun-Li Xiang, Chi-Toan Le, Viet-Cuong Dang, An-Ming Lu, Pamela S. Soltis, Douglas E. Soltis, Jian-Hua Li &
Zhi-Duan Chen


High species diversity may result from recent rapid speciation in
a ‘cradle’ and/or the gradual accumulation and preservation of
species over time in a ‘museum’ (refs 1, 2). China harbours nearly 10% of
angiosperm species worldwide and has long been considered as both
a museum, owing to the presence of many species with hypothesized
ancient origins (refs 3, 4), and a cradle, as many lineages have originated
as recent topographic changes and climatic shifts, such as the
formation of the Qinghai-Tibetan Plateau and the development of
the monsoon, provided new habitats that promoted remarkable
radiation (refs 5, 6). However, no detailed phylogenetic study has addressed
when and how the major components of the Chinese angiosperm 
flora assembled to form the present-day vegetation. Here we 
investigate the spatio-temporal divergence patterns of the Chinese 
flora using a dated phylogeny of 92% of the angiosperm genera for 
the region, a nearly complete species-level tree comprising 26,978 
species and detailed spatial distribution data. We found that 66% of 
the angiosperm genera in China did not originate until early in the 
Miocene epoch (23 million years ago (Mya)). The flora of eastern 
China bears a signature of older divergence (mean divergence 
times of 22.04-25.39 Mya), phylogenetic overdispersion (spatial 
co-occurrence of distant relatives) and higher phylogenetic diversity. 
In western China, the flora shows more recent divergence (mean 
divergence times of 15.29-18.86 Mya), pronounced phylogenetic 
clustering (co-occurrence of close relatives) and lower phylogenetic 
diversity. Analyses of species-level phylogenetic diversity using 
simulated branch lengths yielded results similar to genus-level 
patterns. Our analyses indicate that eastern China represents a 
floristic museum, and western China an evolutionary cradle, for 
herbaceous genera; eastern China has served as both a museum 
and a cradle for woody genera. These results identify areas of 
high species richness and phylogenetic diversity, and provide a 
foundation on which to build conservation efforts in China. 
Species composition within a geographic area is the result of 
historical processes including speciation, extinction, migration® and 
ongoing ecological interactions. The extent to which each process has 
contributed to spatial and temporal patterns of biodiversity, as well 
as community assembly, varies across the landscape. The biodiversity 
patterns within a region may result from a recent increase in the rate 
of speciation that has generated a cradle of biodiversity. Alternatively, 
biodiversity may derive from the presence of numerous surviving 


ancient lineages, together forming a museum region. The process 
of speciation and the maintenance of ancient lineages need not be 
mutually exclusive, and some regions have features of both cradles 
and museums. 

The evolutionary history of regional floras has typically been 
addressed using specific taxa as exemplars’~® or by examining the 
entire flora at various taxonomic levels'*"'”. These investigations 
provide insights into historical factors, including geological history, 
climatic shifts and evolutionary processes, that might have contributed 
to modern geospatial patterns of biodiversity'*'*. Concomitantly, 
these studies lay the foundation for decision-making in conserving 
biodiversity. However, few studies have explored the biodiversity 
patterns of a large region incorporating dated phylogenies and detailed
distribution data. 

China, which is home to 30,000 of the approximately 350,000- 
400,000 species of vascular plants’, is ideal for investigating patterns 
of biodiversity because of its large size, range of habitats, considerable 
biological diversity and heterogeneous physical geography. Whether 
areas within China serve as cradles or museums remains unclear, as 
floristic components of putative ancient origin?“ and of recent diversi- 
fication® have both been discovered. It has previously been suggested'® 
on the basis of comparisons between the taxonomic richness of vascular 
plants in China and the United States, that the greater species diversity 
in China reflects the region’s complex topography and long connections 
with tropical South-East Asia. On the basis of patterns in species rich- 
ness (using 555 endemic seed plant species), mountainous regions of 
central and southern China have been identified as the main centres 
of plant endemism’”. Previous studies have attributed most of the 
geographic variation in species richness of woody plants in China to 
temperature seasonality'® and the extent of winter cold!’. Notably, to 
our knowledge, no previous study has incorporated both phylogenetic 
and spatial components to address the evolutionary history of the 
Chinese flora. 

We conducted a broad assessment of spatio-temporal divergence
patterns and of the assembly of the Chinese angiosperm flora, using
a robustly dated phylogeny as well as species distribution data (i) to
document the relative proportions of ancient and recent divergences
that shaped the extant Chinese angiosperm flora in various geographic
regions; (ii) to investigate the differential spatio-temporal divergence
patterns of woody and herbaceous genera and their relationships with
environmental variables; and (iii) to compare genus- and species-level
measures of phylogenetic diversity and explore their conservation
implications for the Chinese flora.

Our phylogeny resolved evolutionary relationships among all major
angiosperm lineages in China (Extended Data Fig. 1), yielding topologies
that are highly similar to those for angiosperms as a whole (refs 20, 21).
Our estimates of divergence times based on penalized likelihood and
PATHd8 are congruent with one another, and agree with those obtained
in recent studies of angiosperms on a global basis (refs 22, 23; Extended
Data Fig. 2). Divergence time estimates show that 66% of Chinese
angiosperm genera originated during the Neogene and Quaternary
periods; the remaining genera diverged in the Palaeogene (29%) and
Cretaceous (5%) periods. Additionally, the herbaceous genera have
diversified much more rapidly than the woody genera during the past
30 million years (Extended Data Fig. 3).

Figure 1 | Patterns of the MDTs for Chinese angiosperm genera.
a-i, MDT for all genera, woody genera and herbaceous genera (from left
to right), based on all sampled genera (a-c), the youngest 25% of genera
(d-f), and the oldest 25% of genera (g-i) in each grid cell. j-l, Null-
model test to recognize recent (blue grid cells) and ancient (red grid
cells) divergence centres. The analyses included 2,592 angiosperm
genera (woody genera, n = 925; herbaceous genera, n = 1,501; genera
with both woody and herbaceous species, n = 166). Maps adapted from
National Administration of Surveying, Mapping and Geoinformation
of China (http://www.sbsm.gov.cn; review drawing number:
GS(2016)1576).

Figure 2 | Spatio-temporal divergence patterns of the Chinese
angiosperm flora. a, Patterns of MDTs adapted from Fig. 1a.
The dark line represents the 500-mm annual precipitation isoline
adapted from ref. 24 (reprinted from ref. 24, with permission from
Elsevier). b, Ordinal time-tree with the major clades of angiosperms
indicated. The top five orders with genera occurring only in western
China and top 20 orders with genera occurring only in eastern China
are indicated on the tree with blue and red boxes, respectively.
The number of genera distributed in western or eastern China from
each order is shown within the corresponding box. Map adapted
from National Administration of Surveying, Mapping and
Geoinformation of China (http://www.sbsm.gov.cn; review drawing
number: GS(2016)1576).

We divided China into 100-km × 100-km grid cells, evaluated age
variance within grid cells (Extended Data Figs 4, 5), and calculated
mean divergence times (MDTs) and median divergence times of genera
within each grid cell (Fig. 1; Extended Data Figs 6, 7; Supplementary
Information). Mapping the MDTs of all genera revealed a transition belt
that coincides with the modern 500-mm isoline of annual precipitation,
which marks the boundary between humid/semi-humid and arid/
semi-arid areas (ref. 24) (eastern China versus western China, Fig. 2). Both
MDT and null-model analyses indicate that eastern China has older 
lineages (red grid cells, Fig. 1a, j), particularly in central to southern 
China. By contrast, western China, and especially the Qinghai-Tibetan 
Plateau, contains taxa that have diverged more recently (blue grid cells, 
Fig. 1a, j). Furthermore, our genus-level analyses demonstrate that
eastern China is phylogenetically overdispersed with higher
phylogenetic diversity, and that western China shows phylogenetic clustering
with lower phylogenetic diversity (Extended Data Fig. 8). These 
findings are also observed in analyses of phylogenetic diversity based 
on multiple species-level trees, in which taxa that lacked target DNA 
sequences were provided with meaningful branch lengths using a 
birth-death clock model (see Methods; Extended Data Fig. 9). The flora 
of the Cape of South Africa likewise shows phylogenetic structure—the 
western region is phylogenetically clustered, and the eastern region is 
overdispersed!°. However, taxon richness is decoupled from phylo- 
genetic diversity in the Cape of South Africa; in China, taxon richness 
and phylogenetic diversity are positively correlated. 
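
The grid-cell statistics described above can be sketched in a few lines. The example below is a minimal illustration, not the authors' pipeline: the genus ages are made-up numbers, the function names are hypothetical, and the null model (drawing the same number of genera at random from the whole pool and comparing the observed mean with the resulting distribution) is one common design that is assumed here, because the text does not spell out the exact procedure.

```python
import random
import statistics


def mean_divergence_time(ages):
    """Mean divergence time (MDT) of the genera present in one grid cell."""
    return statistics.fmean(ages)


def quartile_mdt(ages, youngest=True):
    """MDT of the youngest (or oldest) 25% of genera in a grid cell."""
    ordered = sorted(ages)
    k = max(1, len(ordered) // 4)
    subset = ordered[:k] if youngest else ordered[-k:]
    return statistics.fmean(subset)


def null_model_ses(cell_ages, pool_ages, n_draws=999, seed=0):
    """Standardized effect size of a cell's MDT against random draws.

    Assumed null model: sample len(cell_ages) genera without replacement
    from the whole pool. Negative values indicate a younger-than-expected
    assemblage, positive values an older-than-expected one.
    """
    rng = random.Random(seed)
    observed = statistics.fmean(cell_ages)
    null_means = [
        statistics.fmean(rng.sample(pool_ages, len(cell_ages)))
        for _ in range(n_draws)
    ]
    return (observed - statistics.fmean(null_means)) / statistics.stdev(null_means)


# Hypothetical genus ages in million years (not data from the study).
POOL = [3.0, 5.5, 8.0, 11.0, 14.5, 18.0, 22.0, 27.0, 33.0, 41.0, 52.0, 66.0]
western_cell = POOL[:4]   # only the youngest genera
eastern_cell = POOL[-4:]  # only the oldest genera

for name, cell in (("western", western_cell), ("eastern", eastern_cell)):
    print(
        name,
        f"MDT={mean_divergence_time(cell):.1f}",
        f"median={statistics.median(cell):.1f}",
        f"SES={null_model_ses(cell, POOL):+.2f}",
    )
```

In this toy setup the western cell comes out as a recent divergence centre (negative standardized effect size) and the eastern cell as an ancient one (positive value), mirroring the blue and red grid cells of the null-model maps.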

Western China includes the arid north-western portion of the 
country and most of the Qinghai-Tibetan Plateau (Fig. 2). A funda- 
mental climate shift may have occurred in western China as recently 
as the early Miocene, owing to the uplift of the Qinghai-Tibetan 
Plateau and subsequent development of the Asian monsoon**”’. Of 
the 111 genera that occur only in western China, 76% originated in 
the past 20 million years and only 24% originated before this time. In 
western China, a much higher percentage of herbaceous than woody
genera has originated since 30 Mya (Fig. 3a). Moreover, genera that
occur only in western China are predominantly members of only a 
few clades (Apiales, Asterales and Brassicales), most of which have 
much younger divergence times than the major clades of eastern China 
(Fig. 2; Extended Data Table 1). MDTs calculated from the youngest 
25% of herbaceous genera in each grid cell also indicate that western 
China—in particular the Qinghai-Tibetan Plateau—has younger 
lineages (Fig. 1f) than eastern China, which further suggests that 
western China represents a cradle for herbaceous angiosperms. 
Mountainous areas of eastern China have been proposed as 
refugia for plants that originated in the early Cretaceous or late 
Jurassic periods*®’” because their geological environment and 
climate (including orogenic movements, annual temperature and 
annual precipitation) may have experienced little change since the 
Cretaceous”®. Of the 1,026 genera that occur only in eastern China, 39% 
originated before 20 Mya and 61% arose more recently than this. Both 
herbaceous and woody genera diverged at similar rates throughout 
geological time (Fig. 3a). The 20 major clades with the largest number 
of genera occurring only in eastern China are distributed throughout 
the ordinal-level time-tree from early-diverging clades (for example, 
Alismatales, Asparagales, Magnoliales and Ranunculales) to later- 
diverging lineages (for example, Asterales, Gentianales and Lamiales) 
(Fig. 2; Extended Data Table 1). MDTs based on the youngest 25% and 
oldest 25% of genera in each grid cell reveal that eastern China has 
old herbaceous lineages (Fig. 1f, i), but has both old and young woody 
lineages (Fig. 1e, h). Eastern China may have served as a museum for
herbaceous genera, but as both a museum and a cradle for woody 
genera. 
Figure 3 | Angiosperm divergence pattern and conservation priorities
in western and eastern China. a, Percentage of genera occurring only in 
western (n = 111) or eastern China (n = 1,026) during geological time.
Western China has a higher percentage of herbaceous genera (purple 
dashed line) than woody genera (purple solid line) that have originated 
since 20 Mya. Western and eastern China are divided by the 500-mm 


The mean annual precipitation (MAP) and mean annual temperature 
(MAT) have higher explanatory power for the MDTs of the herbaceous 
genera (Fig. 4c, f) than of the woody genera (Fig. 4b, e). These patterns 
may reflect the heterogeneity in rates of evolution between herba- 
ceous and woody lineages. Herbaceous plants are well known to have 
higher substitution rates owing to their shorter generation times, which 
perhaps allows them to respond more quickly to environmental change 
through increased genetic divergence and speciation rates™*”?, 

The spatial divergence and diversity patterns of angiosperms 
detected here do not precisely reflect the latitudinal gradient in China; 
MDT and phylogenetic diversity decrease from south-east to north- 
west (Fig. 2a; Extended Data Fig. 8d, g). Our results show the impor- 
tance of water and temperature in limiting the dispersal of species from 
humid and warm regions to drier and colder areas. The effects of topo- 
graphy, with a pronounced altitudinal gradient increasing from east to 
west, and the monsoon climate in eastern Asia are so extensive that the 


isoline of annual precipitation. Plio., Pliocene epoch; Plt., Pleistocene 
epoch. b, Grid cells with the top 5% highest phylogenetic diversity and 
SES-PD at genus (pink) and species (blue) levels. Protected areas are 
highlighted in green. Maps adapted from National Administration of 
Surveying, Mapping and Geoinformation of China (http://www.sbsm.gov. 
cn; review drawing number: GS(2016)1576). 


decreasing temperature and precipitation gradients from south-eastern 
to north-western China are not consistent with the latitudinal gradient, 
as might be expected in flatter regions. 

On the basis of a species-level phylogenetic tree and distribution 
data with ‘county’ as the basic unit, we inferred that the species richness 
and phylogenetic diversity in protected areas cover approximately 88% 
and 96%, respectively, of the total species richness and phylogenetic 
diversity in China. For conservation planning, these values may be over- 
estimates that result from the coarse scale of our distributional data, as 
most nature reserves are smaller in size than Chinese counties. Notably, 
areas with the top 5% highest phylogenetic diversity and standardized 
effect size of phylogenetic diversity (SES-PD) are mainly located 
in several provinces of eastern China (Fig. 3b): Guangdong, Guangxi, 
Guizhou and Hainan for genus-level phylogenetic diversity hotspots, 
and Yunnan for species-level phylogenetic diversity. These areas are 
also hotspots for threatened plants in China. However, in contrast 
to western China, protected areas in eastern China are fragmented 
(Fig. 3b), largely as a result of urbanization and administrative division. 
Our data suggest the need to establish more connections between 
existing nature reserves and national parks that span provincial borders 
to conserve plant lineages of ancient and recent origins in eastern 
China, as well as the other organisms that depend on these floristic 
elements. These findings should be of broad interest to evolutionary 
and conservation biologists, and serve to stimulate better-informed 
conservation planning and research. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS 


Phylogeny reconstruction. Sequences of four plastid genes (atpB, matK, ndhF and 
rbcL) and one mitochondrial gene (matR) were used to reconstruct the phylogeny 
of Chinese vascular plants. Generic circumscriptions were based on ref. 15. We 
sampled one species for the 1,173 genera with only one species in China. For the 
1,736 genera with 2–30 species in China, two species were sampled from each 
genus. For the 267 genera with more than 30 species in China, approximately 10% 
of the species of each genus were sampled, reflecting intrageneric diversity. We 
downloaded all available sequences for the target DNA regions from GenBank; if 
more than one sequence was available for the same locus for a species, the longest 
one of good quality was selected. For genera with sequences that were unavailable 
in the public database (781 genera in total), we generated new sequences from 
leaf materials, collected from the field for 513 genera and from specimens from 
the Chinese National Herbarium (PE) for 47 genera. There are 231 genera that 
remain unavailable because we failed to obtain the materials or amplify the target 
sequences. Details of DNA extraction, PCR, sequencing, alignment, accession 
numbers of sequences and phylogeny reconstruction have previously been 
published. 

Divergence time estimation. We used the penalized likelihood method as implemented 
in treePL (https://github.com/blackrim/treePL) to date divergence times 
of Chinese angiosperms based on the optimal maximum likelihood phylogram 
obtained with RAxML 8.0.22 in the CIPRES Science Gateway, after excluding 
the outgroups (for example, lycophytes, ferns and gymnosperms). Our dated 
phylogeny included 5,864 species native to China, representing 2,665 genera 
from 273 families or approximately 92% of the angiosperm genera of China. We 
validated the available fossils and selected 138 calibrations for dating analyses 
(Supplementary Table 1 in Supplementary Information). The ‘prime’ option was 
applied to identify the best optimization parameters, and a ‘thorough’ analysis was 
then carried out with the optimal parameters determined above (opt = 1, optad = 1 
and optcvad = 4). To identify the best smoothing parameter that affects the penalty 
for rate variation over the phylogram, a ‘random subsample and replicate’ 
cross-validation was conducted with treePL. Confidence intervals for each node were 
calculated following previously published methods. To accommodate 
variation in branch length estimates, we generated 100 bootstrap replicates with 
topology fixed to the above maximum likelihood phylogram but with varying 
branch lengths. We then ran treePL on these 100 replicates. Age statistics 
for all nodes were summarized with TreeAnnotator v.1.8.4. 

We also used an alternative dating method, PATHd8, to estimate divergence 
times of Chinese angiosperms. The calibrations for the PATHd8 analysis were 
identical to those used for the treePL analysis, except that the crown age of angio- 
sperms was set to 138 Mya instead of a maximum of 140 Mya and a minimum 
of 136 Mya (as in treePL) because PATHd8 requires one fixed calibration. Both 
treePL and PATHd8 are rate-smoothing methods, but PATHd8 sequentially 
takes averages over path lengths from an internode to all its descending terminals, 
one pair of sister groups at a time, where smoothing is done stepwise for 
each node separately; by contrast, smoothing in treePL is done simultaneously 
over the tree. The correlation between ages at all nodes based on the treePL 
and PATHd8 analyses was assessed with Spearman’s rank correlation analysis 
in R v.3.2.0. 
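
The rank correlation used to compare node ages from the two dating methods can be illustrated with a minimal sketch. It is written in Python with the standard library only, not in R as in the study; the function names and the toy ages are assumptions for illustration, and ties receive average ranks.

```python
import math


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical node ages (Myr) from two dating methods
ages_method_1 = [12.0, 35.0, 60.0, 110.0]
ages_method_2 = [15.0, 30.0, 72.0, 105.0]
print(spearman(ages_method_1, ages_method_2))
```

Because only the ranks enter the calculation, the coefficient measures whether the two methods order the nodes consistently, not whether the absolute ages agree.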

To evaluate whether dates for this regional time-tree are biased owing 
to the geographic sampling, we used a correlation analysis to compare our 
estimated divergence times with recent global-scale angiosperm time-tree 
reconstructions; one of these represents a family-level time-tree with multiple 
fossil calibration points, and the other is a species-level time-tree with dense 
taxon sampling (32,223 species) and fewer calibrations. The stem age of each 
family was extracted for the Spearman's rank correlation analyses. Only the family 
ages were compared (circumscription of families, following ref. 20), because 
different genera and species were included in the three studies. Ages of genera 
were extracted from our dichotomous time-tree estimated by treePL for the down- 
stream analyses. For monophyletic genera, stem ages were extracted directly by 
tracing their stem node. For genera that are polyphyletic or paraphyletic (380 out 
of 2,665), the stem age of each monophyletic lineage was extracted and the oldest 
one was selected as the age of the genus. The numbers of angiosperm genera that 
originated during specified geological timespans are provided in Extended Data 
Fig. 3, with the global temperature changes since 65 Mya indicated. 

Distribution of angiosperm species in China. The spatial distribution data and 
information on growth form were assembled from nearly all published national 
and provincial floras, as well as some local floras, checklists and herbarium records. 
The spatial distribution data are at the county level (2,377 counties) with an average 
county size of approximately 4,000 km². To minimize the sampling bias of unequal 
sampling areas, we divided the map of China into 100-km × 100-km grid cells, and 
grid cells on the border that cover less than 50% of the area of a grid cell (that is, 
5,000 km²) were excluded from the analyses. Maps of China used in this study were 
adapted from standard maps released by the National Administration of Surveying, 
Mapping and Geoinformation of China (http://www.sbsm.gov.cn; review drawing 
number: GS(2016)1576). The gridded distribution database contained 1,409,239 
occurrence records for 26,978 angiosperm species from 2,845 genera. After match- 
ing with the phylogeny, the final dataset included a total of 2,592 angiosperm 
genera (woody genera, n = 925; herbaceous genera, n = 1,501; genera with both 
woody and herbaceous species, n = 166). 

Spatial distribution of MDTs and null-model test for divergence hotspots. 
To explore the spatial divergence patterns of Chinese angiosperm genera, we 
calculated the weighted MDTs of all genera in each grid cell by integrating spatial 
distribution data with our dated phylogenetic tree. $\mathrm{AGE}_i$ represents the age of 
genus $i$ ($i = 1, \ldots, n$) in a grid cell, and $S_i$ the number of species of genus $i$ in this grid 
cell. From this, MDT was calculated as: 

\[
\mathrm{MDT} = \frac{(\mathrm{AGE}_1 \times S_1) + (\mathrm{AGE}_2 \times S_2) + (\mathrm{AGE}_3 \times S_3) + \cdots + (\mathrm{AGE}_n \times S_n)}{S_1 + S_2 + S_3 + \cdots + S_n}
\]


We further divided the genus dates in each grid cell into quartiles and calculated 
MDTs on the basis of the youngest and oldest quartiles, separately, in each grid cell. 
The MDTs based on the youngest quartile allowed us to recognize centres of recent 
divergence, whereas MDTs based on the oldest quartile detected ancient centres 
of divergence. To avoid potential bias from grid cells that had either relatively old 
or young genera, we ranked all genera from youngest to oldest, partitioned them 
into quartiles based on their ages, computed MDT in each cell for the absolute 
youngest 25% and the absolute oldest 25% of genera, and then mapped the results 
across China. 
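
The weighted MDT and its quartile variants can be sketched as follows. This is a minimal Python illustration, not the R and ArcGIS workflow used in the study; the function names, the toy grid cell, the rounding rule for the quartile size and the choice to weight the quartile MDT by species number are all assumptions.

```python
def weighted_mdt(genera):
    """Species-weighted mean divergence time for one grid cell.

    genera: list of (age_mya, species_count) pairs, one pair per genus.
    """
    total_species = sum(count for _, count in genera)
    if total_species == 0:
        raise ValueError("grid cell contains no species")
    return sum(age * count for age, count in genera) / total_species


def quartile_mdt(genera, which):
    """MDT restricted to the youngest or oldest quarter of genera in a cell."""
    ranked = sorted(genera, key=lambda g: g[0])  # youngest genus first
    k = max(1, len(ranked) // 4)                 # assumed rounding rule
    subset = ranked[:k] if which == "youngest" else ranked[-k:]
    return weighted_mdt(subset)


# Hypothetical grid cell: (stem age in Myr, number of species in the cell)
cell = [(5.0, 10), (20.0, 2), (60.0, 1), (90.0, 3)]
print(weighted_mdt(cell))              # (50 + 40 + 60 + 270) / 16 = 26.25
print(quartile_mdt(cell, "youngest"))  # 5.0
print(quartile_mdt(cell, "oldest"))    # 90.0
```

The species-rich young genus pulls the cell-wide MDT towards the present, which is the intended effect of weighting by species number.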

We designed a null model to identify ancient and recent divergence hotspots for 
the angiosperm flora of China. The mean ages of the youngest and oldest quartiles 
in each grid cell were selected as the observed values for the null models, and 
then we shifted the genera randomly using all genera investigated in China as a 
sampling pool to obtain the null distributions of ages for the youngest and oldest 
quartiles for each grid cell. The standardized effect size of the mean divergence time 
(SES-MDT) of genera for each grid cell was calculated as: 


\[
\text{SES-MDT} = \frac{\mathrm{MDT}_{\mathrm{observed}} - \mathrm{MDT}_{\mathrm{random}}}{\mathrm{s.d.}(\mathrm{MDT}_{\mathrm{random}})}
\]

where $\mathrm{MDT}_{\mathrm{observed}}$ is the observed MDT; $\mathrm{MDT}_{\mathrm{random}}$ is the expected MDT of the 
randomized assemblages (n = 999); and $\mathrm{s.d.}(\mathrm{MDT}_{\mathrm{random}})$ is the s.d. of the MDT for 
the randomized assemblages. Grid cells with values of SES-MDT for the youngest 
quartile below −1.96 were identified as notable hotspots of recent divergence, 
whereas grid cells with SES-MDT for the oldest quartile above 1.96 were identified 
as notable hotspots of ancient divergence. Considering that the evolutionary 
history of herbaceous and woody plants may differ, the above analyses were also 
conducted separately for herbaceous and woody genera. Analyses of MDT were 
implemented in R and ArcGIS 10.1 (http://www.esri.com/). 
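
The null-model test described above can be sketched in a few lines. This Python example is an illustration under stated assumptions, not the code deposited with the study: the function names are hypothetical, genera are drawn without replacement from a national pool, quartile means are unweighted, and the sample standard deviation is used.

```python
import random
from statistics import mean, stdev


def ses(observed, null_values):
    """Standardized effect size: (observed - mean(null)) / s.d.(null)."""
    spread = stdev(null_values)
    if spread == 0:
        raise ValueError("null distribution has zero variance")
    return (observed - mean(null_values)) / spread


def quartile_mean(ages, which):
    """Mean age of the youngest or oldest quarter of a list of genus ages."""
    ranked = sorted(ages)
    k = max(1, len(ranked) // 4)
    return mean(ranked[:k] if which == "youngest" else ranked[-k:])


def null_quartile_means(pool_ages, n_genera, which, n_rand=999, seed=0):
    """Quartile means for random assemblages drawn from the national pool."""
    rng = random.Random(seed)
    return [quartile_mean(rng.sample(pool_ages, n_genera), which)
            for _ in range(n_rand)]


# Hypothetical pool of 200 genus ages (Myr) and a cell holding the 20 youngest
pool = [float(age) for age in range(1, 201)]
cell_ages = pool[:20]

observed = quartile_mean(cell_ages, "youngest")
null = null_quartile_means(pool, len(cell_ages), "youngest")
score = ses(observed, null)
print(score, score < -1.96)  # a score below -1.96 flags recent divergence
```

The same function applied with `which="oldest"` and a threshold of 1.96 gives the test for ancient divergence hotspots.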

Previous studies have demonstrated that the overall species richness patterns 
of birds are largely determined by the geographically wide-ranging species, 
indicating that patterns may be driven by a subset of taxa and may not be 
representative of an entire biota. To explore whether MDT patterns for China are 
influenced largely by values for widespread species, we ranked genera from the 
most restricted to most widespread in China, partitioned the genera into quartiles 
on the basis of their range size and mapped MDT for each quartile following a 
previously published description. 

Spatial distribution of median divergence times. Age variation within grid cells 
was evaluated by plotting divergence times in each grid cell (Extended Data Fig. 4) 
and calculating the skewness and kurtosis of divergence times (Extended Data Fig. 5). 
To verify the results of MDT, we also investigated the distribution patterns of the 
Chinese angiosperm genera by mapping the median divergence times (medianDT) 
based on all genera, and the youngest and oldest quartiles in each grid cell. The null 
model for the median divergence time applied a modified effect-size statistic 
and was calculated as: 

\[
\text{SES-medianDT} =
\begin{cases}
\dfrac{\mathrm{medianDT}_{\mathrm{observed}} - \mathrm{medianDT}_{\mathrm{random}}}{1.4826 \times \mathrm{MAD}_{\mathrm{random}}}, & \text{if } \mathrm{MAD}_{\mathrm{random}} > 0 \\[2ex]
\dfrac{\mathrm{medianDT}_{\mathrm{observed}} - \mathrm{medianDT}_{\mathrm{random}}}{1.2553 \times \mathrm{meanAD}_{\mathrm{random}}}, & \text{if } \mathrm{MAD}_{\mathrm{random}} = 0
\end{cases}
\]

where $\mathrm{medianDT}_{\mathrm{observed}}$ is the observed median divergence time; $\mathrm{medianDT}_{\mathrm{random}}$ 
is the expected median divergence time of the randomized assemblages (n = 999); 
$\mathrm{MAD}_{\mathrm{random}}$ is the median absolute deviation of the divergence times for the 
randomized assemblages; and $\mathrm{meanAD}_{\mathrm{random}}$ is the mean absolute deviation of 
the divergence times for the randomized assemblages. 
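
A minimal Python sketch of this robust statistic is given below. It rests on several assumptions that the text does not settle: the expected value is taken as the median of the per-randomization medians, both absolute deviations are measured around that median, and the function and parameter names are hypothetical. The scaling constants are passed as parameters and default to the values printed above.

```python
from statistics import mean, median


def ses_median(observed, null_medians, k_mad=1.4826, k_meanad=1.2553):
    """Robust standardized effect size for a median divergence time.

    observed:     median divergence time observed in the grid cell
    null_medians: medians from the randomized assemblages (assumed input)
    """
    centre = median(null_medians)
    abs_dev = [abs(value - centre) for value in null_medians]
    mad = median(abs_dev)
    if mad > 0:
        return (observed - centre) / (k_mad * mad)
    # Fall back to the mean absolute deviation when the MAD collapses to zero.
    mean_ad = mean(abs_dev)
    if mean_ad == 0:
        raise ValueError("null distribution has no spread")
    return (observed - centre) / (k_meanad * mean_ad)


# Hypothetical null medians (Myr)
print(ses_median(5.0, [1.0, 2.0, 3.0, 4.0, 5.0]))    # MAD = 1, first branch
print(ses_median(10.0, [3.0, 3.0, 3.0, 3.0, 10.0]))  # MAD = 0, second branch
```

The fallback matters when more than half of the randomized medians are identical, in which case the MAD is zero and the first expression would be undefined.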

Richness and phylogenetic diversity. We calculated the generic richness, Faith’s 
phylogenetic diversity and SES-PD of the Chinese angiosperm genera on the 
basis of our ultrametric chronogram using the ‘picante’ package in R. Faith’s 
phylogenetic diversity is the sum of all phylogenetic branch lengths that connect species 
in a community. We calculated phylogenetic diversity as the length of the subtree 
that joins the genera in each grid cell to the root of the chronogram. SES-PD was 
calculated because phylogenetic diversity is usually positively correlated with 
species richness. We first obtained a null distribution of the expected phylogenetic 
diversity values by shuffling taxa labels across the tips of the tree 999 times for 
each grid cell. SES-PD was then calculated by dividing the difference between the 
observed ($\mathrm{PD}_{\mathrm{observed}}$) and expected phylogenetic diversity ($\mathrm{PD}_{\mathrm{random}}$) by the s.d. 
of the null distribution ($\mathrm{s.d.}(\mathrm{PD}_{\mathrm{random}})$): 

\[
\text{SES-PD} = \frac{\mathrm{PD}_{\mathrm{observed}} - \mathrm{PD}_{\mathrm{random}}}{\mathrm{s.d.}(\mathrm{PD}_{\mathrm{random}})}
\]
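
Faith’s phylogenetic diversity and its standardized effect size can be sketched on a toy tree. The Python example below is an illustration only and does not reproduce the ‘picante’ implementation: the tree encoding, the names and the four-tip chronogram are assumptions, and the label shuffle is emulated by drawing the same number of tips at random, which is equivalent for a single grid cell.

```python
import random
from statistics import mean, stdev

# Hypothetical ultrametric tree: child -> (parent, branch length in Myr).
# The root has no entry, so walking upwards stops there.
TREE = {
    "A": ("n1", 10.0), "B": ("n1", 10.0), "n1": ("root", 40.0),
    "C": ("n2", 30.0), "D": ("n2", 30.0), "n2": ("root", 20.0),
}
TIPS = ["A", "B", "C", "D"]


def faith_pd(tree, taxa):
    """Sum of branch lengths on the subtree joining the taxa to the root."""
    visited = set()
    total = 0.0
    for node in taxa:
        # Walk towards the root, counting each branch only once.
        while node in tree and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


def ses_pd(tree, taxa, tips, n_rand=999, seed=0):
    """SES-PD against random assemblages of the same richness."""
    rng = random.Random(seed)
    observed = faith_pd(tree, taxa)
    null = [faith_pd(tree, rng.sample(tips, len(taxa))) for _ in range(n_rand)]
    return (observed - mean(null)) / stdev(null)


print(faith_pd(TREE, ["A", "B"]))      # 10 + 10 + 40 = 60.0
print(faith_pd(TREE, ["A", "C"]))      # 10 + 40 + 30 + 20 = 100.0
print(ses_pd(TREE, ["A", "B"], TIPS))  # negative: two close relatives
```

Holding richness fixed in the null draws is what removes the positive dependence of phylogenetic diversity on the number of taxa.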


Phylogenetic structure. The net relatedness index (NRI) and the nearest taxon 
index (NTI) were calculated to investigate the phylogenetic structure (clustering 
or overdispersion) of angiosperm genera across China. NRI is based on the mean 
phylogenetic distance (MPD), an estimate of the average phylogenetic relatedness 
between all possible pairs of taxa within a grid cell, and primarily reflects structure 
at deeper parts of the phylogeny. NTI is based on mean nearest taxon distance 
(MNTD), an estimate of the mean phylogenetic relatedness between each 
taxon in a grid cell and its nearest relative in the phylogeny, and reflects shallower 
parts of the phylogeny. NRI and NTI were calculated as follows: 


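\[
\mathrm{NRI} = -1 \times \frac{\mathrm{MPD}_{\mathrm{observed}} - \mathrm{MPD}_{\mathrm{random}}}{\mathrm{s.d.}(\mathrm{MPD}_{\mathrm{random}})}
\]

\[
\mathrm{NTI} = -1 \times \frac{\mathrm{MNTD}_{\mathrm{observed}} - \mathrm{MNTD}_{\mathrm{random}}}{\mathrm{s.d.}(\mathrm{MNTD}_{\mathrm{random}})}
\]

where $\mathrm{MPD}_{\mathrm{observed}}$ and $\mathrm{MNTD}_{\mathrm{observed}}$ are the observed MPD and MNTD; 
$\mathrm{MPD}_{\mathrm{random}}$ and $\mathrm{MNTD}_{\mathrm{random}}$ are the averages of the expected MPD and MNTD of 
the randomized assemblages (n = 999); and $\mathrm{s.d.}(\mathrm{MPD}_{\mathrm{random}})$ and $\mathrm{s.d.}(\mathrm{MNTD}_{\mathrm{random}})$ 
are the standard deviations of $\mathrm{MPD}_{\mathrm{random}}$ and $\mathrm{MNTD}_{\mathrm{random}}$ for the randomized 
assemblages. The null distributions of MPD and MNTD were created by randomly 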
selecting the observed number of genera in each grid cell 999 times, with all genera 
in the phylogeny as a sampling pool. Positive values of NRI and NTI indicate 
phylogenetic clustering, whereas negative values indicate phylogenetic overdisper- 
sion in a grid cell. NRI and NTI for woody and herbaceous genera were calculated 
separately to compare their phylogenetic structures across China. 
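
The two indices can be sketched from a matrix of pairwise phylogenetic distances. The Python example below is a minimal illustration, not the implementation used in the study: the distance table, pool and function names are hypothetical, and the nearest relative for MNTD is searched among the other taxa of the same grid cell, which is an assumption.

```python
import random
from itertools import combinations
from statistics import mean, stdev

# Hypothetical pairwise distances (Myr) between four genera
POOL = ["A", "B", "C", "D"]
DIST = {frozenset(pair): 100.0 for pair in combinations(POOL, 2)}
DIST[frozenset(("A", "B"))] = 20.0
DIST[frozenset(("C", "D"))] = 60.0


def mpd(dist, taxa):
    """Mean phylogenetic distance over all pairs of taxa in a cell."""
    return mean(dist[frozenset(pair)] for pair in combinations(taxa, 2))


def mntd(dist, taxa):
    """Mean distance from each taxon to its nearest relative in the cell."""
    return mean(min(dist[frozenset((t, u))] for u in taxa if u != t)
                for t in taxa)


def negated_ses(metric, dist, taxa, pool, n_rand=999, seed=0):
    """-1 x SES of a distance metric; gives NRI for mpd and NTI for mntd."""
    rng = random.Random(seed)
    observed = metric(dist, taxa)
    null = [metric(dist, rng.sample(pool, len(taxa))) for _ in range(n_rand)]
    return -1 * (observed - mean(null)) / stdev(null)


print(mpd(DIST, ["A", "B", "C"]))   # (20 + 100 + 100) / 3
print(mntd(DIST, ["A", "B", "C"]))  # (20 + 20 + 100) / 3
print(negated_ses(mpd, DIST, ["A", "B"], POOL))   # NRI > 0: clustering
print(negated_ses(mntd, DIST, ["A", "C"], POOL))  # NTI < 0: overdispersion
```

The sign flip means that assemblages of close relatives, which have small observed distances, receive positive index values.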

Regression analyses between MDT and two climatic variables. To explore the 
underlying mechanisms of spatial divergence patterns of the Chinese angiosperms, 
MDT in each grid cell was regressed against the respective mean values of MAP 
and MAT in each grid cell using the linear regression model in R. The adjusted R² 
was used to indicate the explanatory power of each variable, although these 
associations do not necessarily indicate that the climatic variables determine 
MDT. Climatic data were downloaded from the WorldClim database 
Version 1.4 (http://www.worldclim.org/) with a spatial resolution of 10 min. 
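
The adjusted R² of a single-predictor regression can be sketched as below. This is a plain-Python illustration of ordinary least squares, not the R call used in the study; the function name and the toy values for precipitation and MDT are assumptions.

```python
def adjusted_r2(x, y):
    """Adjusted R-squared of an ordinary least-squares fit of y on x."""
    n = len(x)
    if n < 3:
        raise ValueError("at least three observations are required")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r2 = 1 - ss_res / ss_tot
    # One predictor, so the residual degrees of freedom are n - 2.
    return 1 - (1 - r2) * (n - 1) / (n - 2)


# Hypothetical grid-cell values: MAP (mm) and MDT (Myr)
map_mm = [200.0, 600.0, 1000.0, 1400.0]
mdt_myr = [22.0, 30.0, 27.0, 35.0]
print(adjusted_r2(map_mm, mdt_myr))
```

The adjustment penalizes the raw R² for the number of fitted parameters, which keeps values comparable between the MAP and MAT models when the number of grid cells differs.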

Species tree reconstruction and conservation implications. With our dated 
genus-level chronogram as the backbone, a species-level tree including 26,978 
Chinese angiosperm species was generated by inserting species that were not 
sampled in our generic tree within the genera to which they belong using the 
R package ‘S.PhyloMaker’. Our species-level tree included approximately 96% 
of all known angiosperm species native to China; 1,098 aquatic species were not 
sampled. To mitigate the effect of polytomies on the calculation of phylogenetic 
diversity, we resolved polytomies in the reconstructed tree with a birth-death clock 
model. We constructed constraints based on the tree constructed with molecular 
data, and unresolved taxa were then placed within the relevant constraints. Node 
heights for each constraint were fixed on the basis of divergence time estimates. 
We then conducted a Bayesian analysis using MrBayes v.3.2 with the topological 
and node height constraints and with the birth-death (speciation and extinction) 
priors as uniform (0.0, 10.0). Two analyses were run for 2,500,000 generations, 
sampling every 500 generations, to ensure convergence and mixing; the first 
750,000 generations were discarded as burn-in, and 1,000 of the post-burn-in 
trees were retained for further analyses. The species-level phylogenetic diversity 
and SES-PD were calculated on the basis of 10 trees randomly selected from the 
1,000 trees. The Spearman's rank correlation was used to assess the consistency 
of phylogenetic diversity or SES-PD patterns based on different trees. Grid cells 
with the top 5% highest values of both phylogenetic diversity and SES-PD were 
identified as hotspots of phylogenetic diversity (Fig. 3b). MDT analyses were not 
conducted on the species tree as the missing data rendered the variation between 
replicates uninformative. Once additional molecular information is collected for 
these species, further analyses can be performed. 
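
The selection of phylogenetic diversity hotspots described above can be sketched as an intersection of two top-5% sets. The Python example is an illustration under assumptions: the function names and toy values are hypothetical, the cutoff count is rounded to the nearest whole cell, and ties at the cutoff are broken arbitrarily.

```python
def top_fraction(values_by_cell, fraction=0.05):
    """Return the cells whose values rank in the top `fraction` of all cells."""
    k = max(1, round(len(values_by_cell) * fraction))
    ranked = sorted(values_by_cell, key=values_by_cell.get, reverse=True)
    return set(ranked[:k])


def hotspots(pd_by_cell, ses_pd_by_cell, fraction=0.05):
    """Cells in the top fraction for both phylogenetic diversity and SES-PD."""
    return (top_fraction(pd_by_cell, fraction)
            & top_fraction(ses_pd_by_cell, fraction))


# Hypothetical values for 100 grid cells, keyed by cell identifier
pd_by_cell = {cell: float(cell) for cell in range(100)}
ses_pd_by_cell = {cell: float(cell) for cell in range(100)}
ses_pd_by_cell[99] = -5.0   # high diversity but low SES-PD
ses_pd_by_cell[10] = 500.0  # high SES-PD but low diversity

print(sorted(hotspots(pd_by_cell, ses_pd_by_cell)))  # [95, 96, 97, 98]
```

Requiring both criteria excludes cells that are diverse only because they are species-rich, as well as cells with a high SES-PD but little total diversity.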

Spatial data of protected areas in China were compiled from two sources: 
(i) a previous publication that digitized nature reserves in mainland China, which 
included 334 national, 857 provincial and 1,431 prefectural or county-level nature 
reserves (provided by Z.-Y. Tang); and (ii) 92 protected areas in Taiwan, down- 
loaded from the Database of Protected Areas (https://www.protectedplanet.net/; 
accessed August 2017). Considering that most of the nature reserves were designed 
according to administrative units, we calculated richness and phylogenetic diversity 
in the protected areas with ‘county’ as the basic unit rather than by dividing China 
into grid cells. Each conservation area was intersected with the map of China to 
produce the protected areas in ArcGIS. Species occurring in these counties were 
assumed to be protected, but counties with protected areas that covered less than 
10% of the area of a county were excluded to reduce sampling bias. 

Statistics and reproducibility. No statistical methods were used to predetermine 
sample size. Spearman's rank correlation and linear regression analyses were 
conducted in R. Precise P values are provided to show statistical significance. 
Null-model tests (999 random replicates) were used to assess the significance of 
spatial diversity and divergence distributions, with −1.96 and 1.96 as significance 
thresholds. 

Code availability. Example code used to conduct the null-model tests (written in R) 
can be found at Dryad: http://datadryad.org/resource/doi:10.5061/dryad.p89m3. 

Data availability. Sequences for phylogenetic analyses have previously been 
published and deposited in GenBank. The dated phylogeny is archived in Dryad: 
http://datadryad.org/resource/doi:10.5061/dryad.p89m3. The spatial distribu- 
tion data are available from: http://www.darwintree.cn/resource/spatial_data. All 
other additional data are available from the corresponding author upon reasonable 
request. 

Extended Data Figure 1 | Dated megaphylogeny of the Chinese angiosperms. Major clades, including magnoliids, monocots, superrosids and 
superasterids, as well as the basal eudicot grade, are indicated with different colours. Divergence times were estimated using treePL. 

Extended Data Figure 2 | The 95% confidence intervals of divergence 
times and the Spearman’s rank correlation between our dating and 
those of recent publications. a, b, Plots of divergence times and 95% 
confidence intervals (grey bars) for each family (a, n = 273) and genus 
(b, n = 2,909). The centre values are ages calculated based on the optimal 
maximum likelihood tree. c, Correlation of nodal ages between treePL and 
PATHd8 in this study (n = 5,863; r = 0.94, P = 0). d, Correlation of family 
ages between treePL and ref. 22 (n = 236; r = 0.68, P = 1.17 × 10~*9). 
e, Correlation of family ages between treePL and ref. 23 (n = 257; r = 0.55, 
P = 4.54 × 10 **). f, Correlation of family ages between ref. 22 and ref. 23 
(n = 235; r = 0.75, P = 2.11 × 10-*’). The solid line is y = x. 

Extended Data Figure 3 | Number of angiosperm genera that originated 
during specified geological timespans. Column with three colours shows 
the number of woody (grey), herbaceous (yellow) and mixed genera (light 
blue) that originated within a specific geological timespan. Number of 
woody genera, n = 995; number of herbaceous genera, n = 1,569; mixed 
genera (genera with both woody and herbaceous species), n = 101. The 
dashed line indicates the accumulated percentage of genera that have 
originated since the Early Cretaceous. Global temperature changes that 
have occurred since the Palaeogene are shown by the red curve (from 
ref. 39; reprinted with permission from AAAS). The x axis indicates the 
geological period and time in millions of years. The left y axis shows the 
total number of genera that have originated by any given time period; 
the right y axis represents the accumulated percentage of genera that 
originated within a geological time period. 

Extended Data Figure 5 | Histograms and distribution of skewness and 
kurtosis for divergence times in each grid cell. a–c, Range of skewness for 
all genera (a), woody genera (b) and herbaceous genera (c). d–f, Range of 
kurtosis (computed as the fourth standardized moment) for all genera (d), 
woody genera (e) and herbaceous genera (f). g–i, Spatial distribution of 
skewness for all genera (g), woody genera (h) and herbaceous genera (i). 
j–l, Spatial distribution of kurtosis for all genera (j), woody genera (k) and 
herbaceous genera (l). Skewness values in most grid cells are positive and 
around 1–2, which implies that divergence times of genera are slightly 
right-skewed (there are more young ages in each grid cell). Kurtosis values 
in most grid cells are within a range of 4–8, larger than the value (3) for 
a normal distribution, which implies that the distribution of divergence 
times has more extreme outliers than the normal distribution. For eastern 
China, kurtosis values of approximately 4 for all genera are consistent with 
grid cells having a range of divergence times (including very young and 
very old ages), as expected for an area that is both a cradle and a museum. 

grid cells) and ancient (red grid cells) divergence centres for all genera (j), 
... and Geoinformation of China (http://www.sbsm.gov.cn; review drawing 
number: GS(2016)1576). 

Extended Data Figure 7 | Spatial distribution of MDTs based on 
geographic range-size quartiles and the youngest 25% and oldest 25% of 
genera in China. a–d, MDT patterns of the first (a), second (b), third (c) 
and fourth quartiles (d) of the sampled Chinese angiosperm genera. 
The first, second, third and fourth quartiles range from the narrowest 
to the widest geographic distribution, and represent 0.6%, 3.5%, 13.7% 
and 82.1% of 1,409,239 records, respectively. The Spearman's rank 
correlation coefficients between the overall MDT (including all genera) 
and MDT of the first, second, third and fourth geographic quartile are 
0.12 (P = 1.46 × 10⁻⁴), 0.59 (P = 1.21 × 10-*”), 0.43 (P = 2.51 × 10-) and 
0.99 (P = 0), respectively. e, MDT pattern of the youngest 25% of genera in 
China, showing that there are young genera in both western and eastern 
China. f, MDT pattern of the oldest 25% of genera in China, confirming 
that older genera mainly occur in eastern China. Maps adapted from 
National Administration of Surveying, Mapping and Geoinformation of 
China (http://www.sbsm.gov.cn; review drawing number: GS(2016)1576). 

Extended Data Figure 8 | Patterns of generic richness, phylogenetic 
diversity and phylogenetic structure for the Chinese angiosperm 
genera. a–c, Richness for all genera (a), woody genera (b) and herbaceous 
genera (c). d–f, Phylogenetic diversity for all genera (d), woody genera (e) 
and herbaceous genera (f). g–i, SES-PD for all genera (g), woody genera (h) 
and herbaceous genera (i). j–l, NRI for all genera (j), woody genera (k) and 
herbaceous genera (l). m–o, NTI for all genera (m), woody genera (n) 
and herbaceous genera (o). The analyses include 2,592 angiosperm 
genera (woody genera, n = 925; herbaceous genera, n = 1,501; genera 
with both woody and herbaceous species, n = 166). Maps adapted from 
National Administration of Surveying, Mapping and Geoinformation of 
China (http://www.sbsm.gov.cn; review drawing number: GS(2016)1576). 

Extended Data Figure 9 | Patterns of species-level phylogenetic diversity 
for all Chinese angiosperms. a–l, Observed phylogenetic diversity for 
all species (a, d, g, j), woody species (b, e, h, k) and herbaceous species 
(c, f, i, l) based on species trees 210, 30, 174 and 461 (species trees were 
randomly selected from 1,000 post-burn-in trees). m–o, SES-PD for 
all species (m), woody species (n) and herbaceous species (o) based on 
species tree 461. The analyses include 26,978 angiosperm species (woody, 
n = 10,169; herbaceous, n = 16,809). Phylogenetic diversity and SES-PD 
based on 10 species trees produce similar patterns; Spearman’s rank 
correlation coefficients, r > 0.99, P < 2.20 × 10⁻¹⁶. Maps adapted from 
National Administration of Surveying, Mapping and Geoinformation of 
China (http://www.sbsm.gov.cn; review drawing number: GS(2016)1576). 

Extended Data Table 1 | Number of genera that occur only in western or eastern China, with the number of woody, herbaceous and mixed 
genera in each order indicated 

Western China 

Rank  Order           No. of genera  Woody  Herbaceous  Mixed 
1     Asterales       26             0      26          0 
2     Brassicales     23             0      23          0 
3     Caryophyllales  18             3      14          1 
4     Apiales         10             0      10          0 
5     Fabales         8              4      3           1 
6     Poales          6              2      4           0 
7     Lamiales        5              0      5           0 
8     Malvales        3              0      3           0 
9     Asparagales     3              0      3           0 
10    Boraginales     2              0      2           0 
11    Alismatales     1              0      1           0 
12    Ranunculales    1              0      1           0 
13    Malpighiales    1              0      1           0 
14    Myrtales        1              0      1           0 
15    Rosales         1              1      0           0 
16    Saxifragales    1              0      1           0 
17    Zygophyllales   1              1      0           0 

Eastern China 

Rank  Order           No. of genera  Woody  Herbaceous  Mixed 
1     Lamiales        112            25     72          15 
2     Gentianales     102            82     18          2 
3     Asparagales     98             1      96          1 
4     Poales          81             16     65          0 
5     Malpighiales    72             60     9           3 
6     Fabales         54             37     16          1 
7     Sapindales      46             43     3           0 
8     Asterales       45             7      35          3 
9     Ericales        32             27     5           0 
10    Alismatales     30             0      30          0 
11    Malvales        29             24     3           2 
12    Myrtales        29             24     3           2 
13    Magnoliales     25             25     0           0 
14    Rosales         24             20     4           0 
15    Ranunculales    23             14     9           0 
16    Saxifragales    19             14     5           0 
17    Caryophyllales  18             3      14          1 
18    Solanales       17             8      6           3 
19    Zingiberales    16             1      15          0 
20    Apiales         15             8      6           1 
21    Cucurbitales    13             2      11          0 
22    Laurales        13             12     1           0 
23    Arecales        13             13     0           0 
24    Santalales      12             12     0           0 
25    Cornales        10             7      3           0 
26    Brassicales     9              5      4           0 
27    Celastrales     9              9      0           0 
28    Icacinales      7              7      0           0 
29    Oxalidales      6              5      1           0 
30    Fagales         6              6      0           0 
31    Commelinales    6              0      6           0 
32    Boraginales     5              1      4           0 
33    Liliales        4              0      4           0 
34    Aquifoliales    3              2      1           0 
35    Pandanales      3              1      2           0 
36    Dipsacales      3              3      0           0 
37    Piperales       3              0      3           0 
38    Dioscoreales    2              0      2           0 
39    Dilleniales     2              2      0           0 
40    Chloranthales   2              0      1           1 
41    Huerteales      2              2      0           0 
42    Nymphaeales     1              0      1           0 
43    Proteales       1              1      0           0 
44    Petrosaviales   1              0      1           0 
45    Metteniusales   1              1      0           0 
46    Escalloniales   1              1      0           0 
47    Vitales         1              1      0           0 

Mixed, genera with both woody and herbaceous species. 

Enhancer redundancy provides phenotypic 
robustness in mammalian development 

Marco Osterwalder, Iros Barozzi, Virginie Tissiéres, Yoko Fukuda-Yuzawa, Brandon J. Mannion, Sarah Y. Afzal, 
Elizabeth A. Lee, Yiwen Zhu, Ingrid Plajzer-Frick, Catherine S. Pickle, Momoe Kato, Tyler H. Garvin, Quan T. Pham, 
Anne N. Harrington, Jennifer A. Akiyama, Veena Afzal, Javier Lopez-Rios, Diane E. Dickel, Axel Visel & 
Len A. Pennacchio 


Distant-acting tissue-specific enhancers, which regulate gene 
expression, vastly outnumber protein-coding genes in mammalian 
genomes, but the functional importance of this regulatory 
complexity remains unclear. Here we show that the pervasive 
presence of multiple enhancers with similar activities near the 
same gene confers phenotypic robustness to loss-of-function 
mutations in individual enhancers. We used genome editing to 
create 23 mouse deletion lines and inter-crosses, including both 
single and combinatorial enhancer deletions at seven distinct 
loci required for limb development. Unexpectedly, none of the 
ten deletions of individual enhancers caused noticeable changes 
in limb morphology. By contrast, the removal of pairs of limb 
enhancers near the same gene resulted in discernible phenotypes, 
indicating that enhancers function redundantly in establishing 
normal morphology. In a genetic background sensitized by reduced 
baseline expression of the target gene, even single enhancer deletions 
caused limb abnormalities, suggesting that functional redundancy is 
conferred by additive effects of enhancers on gene expression levels. 
A genome-wide analysis integrating epigenomic and transcriptomic 
data from 29 developmental mouse tissues revealed that mammalian 
genes are very commonly associated with multiple enhancers that 
have similar spatiotemporal activity. Systematic exploration of three 
representative developmental structures (limb, brain and heart) 
uncovered more than one thousand cases in which five or more 
enhancers with redundant activity patterns were found near the 
same gene. Together, our data indicate that enhancer redundancy 
is a remarkably widespread feature of mammalian genomes that 
provides an effective regulatory buffer to prevent deleterious 
phenotypic consequences upon the loss of individual enhancers. 

Enhancers are a principal class of cis-regulatory elements that
orchestrate precise gene expression patterns, which are essential for
numerous processes including embryonic development. They are now
routinely predicted by genome-wide chromatin profiling methods,
which identify positions of open chromatin or enhancer-associated
histone marks. Enhancers predicted by these high-throughput
approaches outnumber genes by approximately an order of magnitude,
raising the question of their functional significance. In particular, it
remains unclear whether mammalian enhancers typically regulate
complementary spatiotemporal aspects of gene expression in an
additive fashion, or if this regulatory complexity more commonly
results in functional redundancy among enhancers associated with
the same gene.

Using the developing limb as a model for gene regulation during
morphogenetic processes, we investigated the functional importance
of enhancers in vivo. We used CRISPR-Cas9 genome editing to
Figure 1 | Lack of limb morphological abnormalities in ten enhancer
deletion lines. a, All selected enhancers are active in the limb mesenchyme
(blue shading) at E11.5, are marked by epigenomic H3K27 acetylation
and DNase I hypersensitivity (DHS) at E11.5, and contain a conserved
core sequence (Cons). Target gene expression and limb morphology were
assessed following deletion of individual enhancers (Extended Data
Fig. 1a–j). b, None of the individual enhancer deletions caused obvious
defects in the structure of skeletal elements. Enhancers are identified
by VISTA ID numbers. Enhancer activities (left, E11.5) and forelimb
skeletons of enhancer knockout (KO) embryos (right, E18.5) are shown
(see Extended Data Fig. 3 for wild-type controls). Predicted target
gene and enhancer distance (+, downstream; −, upstream) from the
transcriptional start site (TSS) are indicated. n represents independent
biological replicates with similar results. Scale bars, 100 μm (E11.5), 1 mm
(E18.5).
Figure 2 | Pairwise loss of limb enhancers with overlapping activities
results in morphological abnormalities. a, b, CRISPR-deleted enhancers
and their distance to the TSSs of predicted target genes (Gli3, Shox2).
c, Left, RNA in situ hybridization reveals reduced Gli3 expression in
anterior hand plates of mm1179/hs1586 DKO embryos (white arrowhead).
Red arrowhead, local expansion of anterior mesenchyme, a hallmark
of Gli3 deficiency. Right, forelimb skeletons with digits labelled 1 to 5,
from anterior to posterior. DKO embryos exhibit duplication of digit 1
(red arrowhead). Scale bars, 200 μm. WT, wild type. d, Reduced femur
ossification length in hs741/hs1262 DKO embryos (normalized to tibia


individually delete ten embryonic enhancers, each with strong evolutionary
conservation and robust limb activity in transgenic mouse
reporter assays (VISTA Enhancer Browser: https://enhancer.lbl.gov/)
(Fig. 1a, Extended Data Fig. 1a–j and Supplementary Table 1).
Each enhancer (identified by VISTA ID number) is located in the
vicinity of a gene associated with human congenital limb malformations,
and deletion of these genes in mice results in limb phenotypes
ranging from polydactyly (Gli3) to complete loss of limbs (Fgf10)
(Extended Data Fig. 1 and Supplementary Table 2). In all cases, the limb
activity pattern of the enhancer at embryonic day 11.5 (E11.5) overlaps
spatial RNA expression of the associated target gene, suggesting that
these enhancers are part of the regulatory architecture that controls
the expression of these genes (Extended Data Fig. 2). Capture-C
chromatin conformation data from embryonic limbs confirmed that
at least six of these enhancers physically interacted with their predicted
target genes (Extended Data Fig. 1k). This framework enabled us to
investigate the functional contribution of each enhancer by comparing
the potential limb skeletal abnormalities caused by enhancer loss to the
phenotypes observed in gene knockout mice.

Unexpectedly, we did not detect any abnormalities in bone number, 
shape, length, position or mineralization in mice in which any of the 
ten single enhancers was deleted (Fig. 1b and Extended Data Fig. 3). 
Similarly, we observed neither significant differences in predicted target 
gene expression in embryonic limbs for nine out of ten individual 
enhancer deletions, nor obvious changes in local H3K27ac (acetylation 
of lysine 27 on histone H3) signatures outside the deleted enhancers 
(Extended Data Figs 2, 4). Together, these results suggest that a substan- 
tial proportion of limb enhancers, even if highly conserved in evolution, 
are not individually essential for normal limb morphogenesis. 
ossification length). Box plot indicates median, interquartile values, range
and individual biological replicates. ***P < 0.001 (two-tailed, unpaired
t-test). e, f, Co-localization of Gli3 (e; mm1179, green; hs1586, red) and
Shox2 (f; hs741, green; hs1262, red) enhancer activities via enhancer-
reporter transgenes and immunofluorescence in forelimb buds of double
transgenic embryos. White arrowheads indicate examples of double-
positive cells. Empty arrowheads or arrows indicate cells marked by
single enhancers. Nuclei are stained blue. Scale bars, 50 μm. n represents
independent biological replicates with similar results.


One possible explanation for the lack of an obvious phenotype in
individual limb enhancer knockout lines is that different enhancers
associated with the same gene may have spatiotemporally redundant,
rather than unique, activity. Our selected panel of enhancers (Fig. 1b
and Extended Data Fig. 1a–j) included three enhancer pairs with
overlapping limb activity domains and the same predicted target gene
(mm1179-hs1586, hs741-hs1262, and hs1467-mm636; Extended Data
Fig. 5a–c). Using iterative CRISPR-Cas9 genome editing, we generated
double enhancer knockout (DKO) mice for each enhancer pair
(Extended Data Fig. 5a–d, g, j), such that both deletions occurred in
cis. In two out of three cases, involving enhancer pairs near Gli3 and
Shox2, homozygous DKO embryos showed phenotypic abnormalities
affecting skeletal limb morphology (Fig. 2a–d and Extended Data
Fig. 5f, i, j). Mice lacking both enhancers near Gli3 (mm1179 and
hs1586) had substantially reduced Gli3 expression in the embryonic
hand plate and exhibited forelimb-specific polydactyly (Fig. 2c and
Extended Data Fig. 5e, f), a phenotypic hallmark of diminished Gli3
expression. In addition, combined deletion of the two enhancers
near Shox2 (hs741 and hs1262) reduced Shox2 expression, predominantly
in embryonic hindlimbs, and resulted in a marked reduction
in femur ossification (Fig. 2d and Extended Data Fig. 5h, i), consistent
with the stylopod reductions observed when the Shox2 gene is
inactivated. Together, these results show that although each of the
four enhancers near Gli3 and Shox2 is individually dispensable for limb
morphology, the respective pairs of enhancers are collectively required
for normal limb development.

To examine the degree of overlap between the activity patterns of 
phenotypically redundant enhancers at the cellular level, we generated 
transgenic mouse lines expressing fluorescent reporters under the 
Figure 3 | Normally dispensable individual enhancers are required
for limb morphology in a sensitized background. Individual and
combined enhancer deletions in the presence of only one copy of the
Gli3 (a) or Shox2 (b) target genes and the resulting limb morphology at
E18.5. Wedges indicate inferred gene dosage. a, Skeletal forelimb autopod
phenotypes at E18.5 resulting from mm1179 and hs1586 enhancer
deletions in the presence of reduced Gli3 dosage. 1-5, normal digits. Red
asterisk, extra digits with unclear identity. *s, ‘split’ digit. Black asterisk
and arrowhead indicate the presence of hypoplastic distal phalanges.


control of each of the Gli3 or Shox2 enhancers (mm1179-GFP, hs1586-
mCherry, hs741-GFP and hs1262-mCherry). Using immunofluorescence
on limb sections from double transgenic embryos, we tracked the
activity of each of the four enhancers during limb development (Fig. 2e, f
and Extended Data Fig. 6). Consistent with the preaxial polydactyly
observed in Gli3 DKO embryos, limb progenitor cells marked by both
Gli3 enhancers were observed at high density in the anterior limb
mesenchyme (Fig. 2e and Extended Data Fig. 6c, d). In Shox2 double
enhancer reporter embryos, a major accumulation of cells with dual
Shox2 enhancer activities is present in a proximal limb mesenchymal
cell population known to harbour stylopod progenitors (Fig. 2f). In
conjunction with our deletion studies, these results illustrate the degree
of functional overlap between pairs of enhancers near the same gene
at the cellular level.

Considering the apparent contrast between the morphological
redundancy of pairs of enhancers and the strong evolutionary conservation
of each individual enhancer, we studied the phenotypic effect
of single and combinatorial enhancer deletions in sensitized genetic
backgrounds carrying heterozygous deletions of the presumptive target
genes (Fig. 3). We used CRISPR-Cas9 to engineer Gli3 and Shox2 gene
loss-of-function alleles, which recapitulated expected gene dosage
reductions and previously published phenotypes (Extended Data
Figs 7, 8). We then used these alleles to generate compound heterozygous
mice harbouring one or more disrupted enhancers with a wild-type
gene on one allele and a disrupted gene but wild-type enhancers
on the other allele (Fig. 3). For Gli3, the absence of either enhancer
(mm1179 or hs1586) in the presence of only one functional Gli3 allele
resulted in a supernumerary anterior digit (Fig. 3a and Extended


b, Progressive reduction in femur ossification length (double arrows) due
to hs741 and hs1262 enhancer loss in a Shox2-sensitized background. The
relative length of femur ossification, normalized to the tibia ossification
length, is shown. For comparison, the bottom panel shows absence
of femur ossification in Shox2-deficient limbs at P0 (red arrowhead,
reproduced with permission from ref. 25). n represents number of
independent biological replicates with similar results. Box plots indicate
median, interquartile values, range and individual biological replicates.
*P < 0.001 (two-tailed, unpaired t-test). Scale bars, 500 μm.


Data Fig. 8a), which is more severe than the terminally bifurcated
thumb observed in Gli3 heterozygotes (Fig. 3a). Similarly, for Shox2
the removal of either neighbouring enhancer (hs1262 or hs741) in
combination with compound heterozygous deletion of the Shox2
gene resulted in a more pronounced reduction in femur length than
observed in Shox2 heterozygotes (Fig. 3b). For both pairs of enhancers,
compound heterozygous mice carrying deletions of both enhancers on
one allele and a deletion of the gene on the other allele showed even
more severe phenotypes. In the case of Gli3, loss of both enhancers
over a Gli3 null allele resulted in greatly reduced expression of Gli3
(Extended Data Fig. 7b, c) and severe pre-axial polydactyly in forelimbs,
similar in severity to homozygous loss of the Gli3 gene (Fig. 3a and
Extended Data Fig. 8a). Likewise, compound heterozygous deletion
of enhancers hs741 and hs1262 over a Shox2 gene deletion strongly
reduced Shox2 expression (Extended Data Fig. 7e, f) and resulted in
a severe reduction in femur length and substantial shortening of the
humerus (Fig. 3b and Extended Data Fig. 8b, c), consistent with the
phenotypes that result from homozygous Shox2 gene loss. Together,
our data demonstrate that these developmental enhancers, although
seemingly dispensable under non-sensitized conditions, show individual
functional contributions to limb development under conditions of
reduced genetic robustness.

The lack of phenotypic change upon deletion of individual enhancers, 
and the functional redundancy observed among enhancer pairs, raises 
the question of how commonly such redundancy occurs in mamma- 
lian gene regulatory landscapes. To explore this question systemati- 
cally, we devised a genome-wide, correlation-based computational 
approach to estimate the number of enhancers that regulate each gene 


[Figure 4 graphic not reproduced. Panel a: genome-wide map of enhancer-gene assignments at the locus containing Tbx3, Tbx5 and Lhx5, with Hi-C interaction counts, combined enhancer activities and gene expression. Panel b: number of enhancers per gene for housekeeping genes (n = 1,287), limb TFs (n = 41), forebrain TFs (n = 21) and heart TFs (n = 27). Panel c: ranked genes, with major limb developmental TFs highlighted. Panel d: enhancers in limb at E11.5, enhancers in limb, and enhancers other than limb.]

Figure 4 | Enhancers with redundant signatures are prevalent 

near developmental genes. a, Enhancer-gene assignments based on 
correlation of H3K27ac and mRNA profiles across a wide array of tissues 
(Extended Data Fig. 9a). Top, at an example locus encompassing Tbx3, 
Tbx5, and Lhx5, up to 25 enhancers are assigned to each of these three 
genes (blue, pink and brown boxes, Extended Data Fig. 9c). Genes 
showing fewer than five assigned enhancers are shown in grey. Bottom, 
heat maps showing meta-profiles of each gene’s expression profile 
across tissues (red shades), along with the cumulative activity profile of 
its assigned enhancers (blue shades). b, Distribution of the number of 
enhancers assigned to developmental transcription factors (TFs) with 
biased expression in limb (P=5 x 10~'° versus housekeeping), forebrain 
(P=8 x 1071), and heart (P=3 x 10°) (two-sided Mann-Whitney 
tests). Box plots show median, interquartile values, range, and outliers 
(individual points). c, Complete spectrum of genes with at least one 
assigned enhancer, sorted by decreasing enhancer number. 
Limb-biased transcription factors are highlighted in green. d, Total 
number of enhancers (in all tissues analysed) assigned to each 
transcription factor in c, with the number of assigned enhancers 
predicted specifically in limb at E11.5 (dark green) or any other stage 
analysed (light green). 
during development, taking advantage of chromatin signatures of distal
enhancers and gene transcription measured across multiple tissues and
time points of mouse development (Fig. 4 and Extended Data Figs 9, 10).
We analysed correlations between H3K27ac chromatin immunoprecipitation
followed by sequencing (ChIP-seq) and RNA sequencing
(RNA-seq) datasets from twelve different mouse tissues at two or three
embryonic or perinatal time points per tissue
(https://www.encodeproject.org/) to assign each enhancer to its most likely target gene
within the same topologically associated domain (TAD) (Fig. 4,
Extended Data Fig. 9a–c and Methods). We then used this framework
to examine the average number of enhancers associated with genes
expressed in three developmental tissues (limb, heart, and forebrain).
Genes with limb-biased expression showed a median of three associated
distal enhancers, versus a median of zero for housekeeping
genes (Extended Data Fig. 9d, e). For the specific class of limb-biased
genes encoding transcription factors, we observed an even more complex
enhancer landscape, with a median of eight distinct enhancers
per gene (Fig. 4b). Notably, some of these transcription factor genes
were associated with more than ten tissue-specific limb enhancers
with highly overlapping activity patterns in the same tissue (Fig. 4c, d
and Methods). We observed similarly large numbers of potentially
redundant enhancers near brain- and heart-specific transcription factor
genes (Extended Data Fig. 10a, b). Even under stringent correlation
thresholds, our analysis uncovered 1,058 genes associated with five
or more enhancers showing putatively redundant activity patterns,
that is, enhancers that are active in the same tissue (Extended Data
Fig. 10c–f). These results indicate that developmentally expressed genes
are commonly associated with multiple enhancers that show overlapping
activity patterns, supporting the widespread existence of functionally
redundant enhancers in mammalian genomes.
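
The correlation-based assignment described above can be illustrated with a minimal sketch. It assumes Pearson correlation between an enhancer's H3K27ac profile and a gene's RNA-seq profile across the same ordered set of tissue and time-point samples; the function names, the `min_r` cutoff and the data layout are hypothetical and are not taken from the study's pipeline.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def assign_enhancers(enhancers, genes, min_r=0.5):
    """Assign each enhancer to its best-correlated gene within the same TAD.

    enhancers: {name: (tad_id, h3k27ac_profile)}
    genes:     {name: (tad_id, rna_profile)}
    min_r:     hypothetical correlation cutoff; weaker matches stay unassigned
    """
    assignment = {}
    for e_name, (e_tad, e_profile) in enhancers.items():
        best_gene, best_r = None, min_r
        for g_name, (g_tad, g_profile) in genes.items():
            if g_tad != e_tad:
                continue  # only genes in the enhancer's own TAD are candidates
            r = pearson(e_profile, g_profile)
            if r > best_r:
                best_gene, best_r = g_name, r
        if best_gene is not None:
            assignment[e_name] = best_gene
    return assignment


def enhancers_per_gene(assignment):
    """Count how many enhancers were assigned to each gene."""
    counts = {}
    for gene in assignment.values():
        counts[gene] = counts.get(gene, 0) + 1
    return counts
```

Counting assigned enhancers per gene in this way gives the kind of per-gene tally that the text summarizes as medians for limb-biased and housekeeping genes.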

Studies of individual loci have identified examples of mammalian
enhancers near the same gene with remarkably similar spatiotemporal
activity patterns or functions, reminiscent of invertebrate
‘shadow enhancers’. The lack of marked morphological phenotypes
in our enhancer deletion mouse models suggests that panels
of mammalian enhancers with large degrees of redundancy act as a
regulatory buffer for key developmental processes, thereby reducing the
likelihood of severe consequences resulting from genetic or environmental
challenges. Although individual examples of enhancers whose
loss leads to severe phenotypes have been described, our findings
suggest that redundancy is far more common. As indicated by the
phenotypes observed in sensitized genetic backgrounds, our results
suggest that pairs of enhancers act redundantly in organismal patterning,
but additively in establishing gene expression levels. This observation
is consistent with high-throughput loss-of-function screens in
cultured cells, in which the disruption of individual enhancers leads
to measurable gene expression changes but rarely results in the complete
loss of target gene expression. It appears plausible to assume that
limited but specific contributions to overall gene expression levels are
relevant for organismal fitness under specific pressures, thus subjecting
enhancers to purifying selection over evolutionary time. Alternatively,
additional tissue-specific functions may also explain the evolutionary
constraints on these loci.

Our observations have implications for the interpretation of noncoding
regulatory variants in relation to human phenotypes. Our findings
suggest that many loss-of-function enhancer mutations will cause,
at most, subtle phenotypes in humans. Thus, for many genetic loci,
enhancer-associated disease phenotypes may be more likely to result
from gain-of-function mutations that either expand enhancer activity
or alter the positions of enhancers relative to genes.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS

Experimental design. All animal work was reviewed and approved by the 
Lawrence Berkeley National Laboratory (LBNL) Animal Welfare Committee. All 
mice used in this study were housed at the Animal Care Facility (ACF) at LBNL. 
Mice were monitored daily for food and water intake, and animals were inspected 
weekly by the Chair of the Animal Welfare and Research Committee and the head 
of the animal facility in consultation with the veterinary staff. The LBNL ACF 
is accredited by the American Association for the Accreditation of Laboratory 
Animal Care (AAALAC). Transgenic mouse assays and enhancer knockouts 
were performed in Mus musculus FVB strain mice. The following developmental 
stages were used in this study: embryonic day E10.5, E11.5, E12.5 and E18.5 
mice. Animals of both sexes were used in the analysis. Sample size selection and 
randomization strategies were conducted as described below. 

Transgenic mouse assay selection and randomization. Sample sizes were selected 
empirically on the basis of our previous experience of performing transgenic mouse 
assays for more than 2,000 total putative enhancers (VISTA Enhancer Browser: 
https://enhancer.lbl.gov/). Mouse embryos were excluded from further analysis if
they did not contain the reporter transgene or if the developmental stage was not 
correct. All transgenic mice were treated with identical experimental conditions. 
Randomization and experimenter blinding were unnecessary and not performed. 
Enhancer knockout selection and randomization. Sample sizes were selected
empirically on the basis of our previous studies. All phenotypic characterization
of knockout mice used a matched littermate selection strategy. All phenotyped
mice described in the paper resulted from crossing heterozygous enhancer deletion
mice together to allow the comparison of matched littermates of different genotypes.
Embryonic samples used for in situ hybridizations, RNA-seq, and skeletal
preparations were dissected blinded to genotype.

In vivo transgenic reporter assays. Enhancer names in this study are the unique
identifiers used in the VISTA Enhancer Browser (https://enhancer.lbl.gov/; mm:
originally identified in mouse; hs: originally identified in human). Transgenic
results for most enhancers have been reported previously. Newly tested
enhancers (hs1586 at E10.5 and hs1262) were amplified from human genomic
DNA and cloned into an hsp68-lacZ expression vector as previously described.
Genomic coordinates of all enhancers are listed in Supplementary Table 1. LacZ
transgenic mouse assays were conducted as previously described. To directly
compare the activity domains between apparently redundant enhancers, enhancers
were cloned, using Gateway (Thermo Fisher Scientific) or Gibson methods, into
an hsp68-based reporter vector similar to that described above, with the exception
of a fluorescent reporter replacing LacZ. The enhancer-reporter combinations
were generated as follows: mm1179-sfGFP, hs1586-mCherry, hs741-sfGFP
and hs1262-mCherry. sfGFP is a fusion of Sun1 and 2x sfGFP as described
and localizes to the nuclear membrane. Mice carrying the individual fluorescent
reporter transgenes were then generated via pronuclear injection (using FVB strain
zygotes), and stable lines were established from founders showing reproducible
reporter activity in the embryonic limb.

Generation of enhancer knockout mice using CRISPR-Cas9. Mouse strains
lacking limb enhancer(s) or harbouring gene loss-of-function alleles were
generated using in vivo CRISPR-Cas9 editing, as previously described, with
only minor modifications. Pairs of single guide RNAs (sgRNAs) targeting
genomic sequence 5′ and 3′ to the sequence to be deleted were designed using
CHOPCHOP (see Supplementary Table 1 for sgRNA sequences and coordinates
of deleted regions). Knockout mice were engineered as described previously using
a mix containing Cas9 mRNA (final concentration of 100 ng/μl) and two sgRNAs
(25 ng/μl each) in injection buffer (10 mM Tris, pH 7.5; 0.1 mM EDTA). This mix
was injected into the cytoplasm of single-cell FVB strain mouse embryos. Founder
(F0) mice were genotyped using PCR with High Fidelity Platinum Taq Polymerase
(Thermo Fisher Scientific) to identify those with the desired non-homologous end
joining (NHEJ)-generated deletion breakpoints (see Extended Data Figs 1a–j, 5a–c,
7a, d and Supplementary Table 3 for genotyping strategy, primer sequences and
PCR amplicons). Sanger sequencing was used to identify and confirm deletion
breakpoints in F0 and F1 mice (Extended Data Figs 1a–j, 5a–c, 7a, d). Unless noted
otherwise, mice homozygous-null for the targeted limb enhancers showed normal
pre- and postnatal viability and appeared outwardly normal. For iterative CRISPR-
Cas9 genome editing, fertilized mouse eggs harbouring the primary deletion were
collected and injected with sgRNAs targeting the secondary enhancer for deletion.
Only those founder lines harbouring both deletions on the same haplotype were
analysed further.

In situ hybridization and skeletal preparations. To assess spatial changes in gene
expression in mouse embryonic limbs, whole mount in situ hybridization using
digoxigenin-labelled antisense riboprobes was carried out as previously described.
Forelimbs and hindlimbs from at least three independent embryos were analysed
for each genotype (including wild-type littermate controls). Mouse embryonic
skeletons at E18.5 were stained with Alcian blue and Alizarin red to differentiate
cartilage (blue) and bone (red) using standard methods. For comparison
of limb skeletons from enhancer knockout embryos and wild-type littermates,
general parameters such as bone number, shape, length, position or mineralization
were assessed. Embryonic limbs and limb skeletons were imaged, and skeletal
elements were measured, using a Leica MZ16 stereo-microscope coupled to
a Leica DFC300Fx or DFC420 digital camera. Brightness and contrast were
adjusted uniformly using Photoshop CS5. Measurements of the ossified portions
of humerus and femur (stylopodial elements) were normalized to those of the ulna
and tibia (related zeugopodial elements), respectively (as shown in Figs 2d, 3b and
Extended Data Figs 5i, 8c).
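
As a hedged illustration of the normalization above, the sketch below divides the ossified stylopod length by the matching zeugopod length and computes an unpaired t statistic for comparing two genotypes. The equal-variance (Student) form is an assumption, since the text states only that the test was two-tailed and unpaired; the function names are hypothetical, and converting the statistic to a P value would need a t-distribution from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_length(stylopod, zeugopod):
    """Ossified stylopod length (femur or humerus) relative to its zeugopod (tibia or ulna)."""
    return stylopod / zeugopod


def unpaired_t_statistic(a, b):
    """Student's t statistic for two independent groups (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
```

Because both measurements come from the same limb, the ratio is unitless and is less sensitive to overall differences in embryo size than raw lengths.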

Quantitative real-time PCR (qPCR) and RNA-seq. RNA was isolated from 
microdissected forelimbs or hindlimbs of mouse embryos at E11.5 using the 
Ambion RNAqueous Total RNA Isolation Kit (Life Technologies) according to the 
manufacturer's instructions. For qPCR, RNA was treated with RNase-free DNase 
(Promega) and reverse transcribed using SuperScript III (Life Technologies) with 
random hexamer or poly-dT priming according to the manufacturer's instructions. 
qPCR was performed on a LightCycler 480 (Roche) using KAPA SYBR FAST qPCR 
Master Mix (Kapa Biosystems) according to the manufacturer's instructions. qPCR 
primers (listed in Supplementary Table 4) were designed in silico using Primer3 
(http://primer3.wi.mit.edu/), and amplicons span exon-exon junctions in order 
to prevent amplification of genomic DNA. Relative gene expression levels were 
calculated using the 2^(−ΔΔCT) method, normalized to the Actb housekeeping gene,
and the mean of wild-type control samples was set to 1. 
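
The relative quantification described above can be sketched as follows. This is a minimal illustration that assumes each sample provides one threshold cycle (CT) value for the target gene and one for Actb, and that calibration is done by dividing by the mean wild-type quantity so that wild-type samples average exactly 1; the function names and data layout are hypothetical.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """CT of the target gene minus CT of the reference gene (here Actb)."""
    return ct_target - ct_reference


def relative_levels(samples, wt_samples):
    """Relative expression of each sample, scaled so wild-type samples average 1.

    samples, wt_samples: lists of (ct_target, ct_actb) pairs.
    """
    wt_quantities = [2 ** -delta_ct(t, r) for t, r in wt_samples]
    calibrator = mean(wt_quantities)
    return [2 ** -delta_ct(t, r) / calibrator for t, r in samples]
```

A target gene that crosses the threshold one cycle later than in wild type, at an unchanged Actb CT, comes out at half the wild-type level.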

For RNA-seq, RNA samples were treated with DNase (TURBO DNA-free 
Kit, Life Technologies), and RNA quality was verified using a 2100 Bioanalyzer 
(Agilent) with an RNA 6000 Nano Kit (Agilent). RNA-seq libraries were generated 
using the TruSeq Stranded mRNA Sample Prep Kit (Illumina), following the 
manufacturer's instructions, and purified, eluted, and quantified as described
previously. RNA-seq libraries were pooled (four per lane) and sequenced using
single end 50-bp reads on a HiSeq 4000 (Illumina). 

Immunofluorescence. Mouse embryonic limbs at E10.5, E11.5 or E12.5 were dissected
in cold PBS and fixed in 4% PFA for 2-3 h. Following incubation in a sucrose
gradient and embedding in a 1:1 mixture of 30% sucrose and optimum cutting
temperature solution, sagittal 10-μm frozen sections were cut using a cryostat.
Cryosections were incubated overnight with the following primary antibodies:
chicken anti-GFP (1:500, Thermo Fisher Scientific, A10262), rabbit anti-mCherry
(1:1,000, Thermo Fisher Scientific, PA5-34974) and goat anti-Sox9 (1:500, R&D
Systems, AF3075). Goat anti-chicken, goat anti-rabbit and donkey anti-goat
secondary antibodies conjugated to Alexa Fluor 488, 568, 594 or 647 (1:1,000,
Thermo Fisher Scientific) were used for detection. Hoechst 33258 (Sigma-Aldrich)
was used to counterstain nuclei. Fluorescent images were acquired using a Zeiss
AxioImager fluorescence microscope in combination with a Hamamatsu Orca-03
camera. Brightness and contrast were adjusted uniformly using Photoshop CS5.
ChIP-seq. For each of six single enhancer knockout lines, ChIP-seq to H3K27ac
was performed using a protocol optimized for mouse embryonic tissues. In brief,
forelimb buds from ten wild-type embryos (four biological replicates) and ten
enhancer knockout embryos (at least two biological replicates) were dissected
at E11.5, formaldehyde crosslinked, and sheared using a Diagenode Bioruptor
Sonicator. After pre-clearing, chromatin was incubated with anti-H3K27ac antibody
(Active Motif cat no. 39133) for 2 h at 4°C. Freshly rinsed Dynabeads (1:1
protein A:protein G mix) were then added to the antibody-treated chromatin, and
immunoprecipitation was performed on a rotator for 30 min at 4°C. Libraries were
prepared using the Illumina Truseq DNA sample prep kit following the manufacturer’s
instructions with minor modifications. Library quality was assessed using
a 2100 Bioanalyzer with the High Sensitivity DNA Kit (Agilent), and quantification
was performed using a Qubit Fluorometer with the dsDNA HS Assay Kit
(Life Technologies). ChIP-seq and input libraries were pooled and sequenced via
single-end 50-bp reads on a HiSeq 2000 or 4000 (Illumina).

RNA-seq and ChIP-seq analysis. Analysis of ChIP-seq and RNA-seq data from
limb enhancer knockout and related wild-type control samples was performed
as follows: CASAVA v1.8.0 (Illumina) was used to demultiplex data, and reads
with CASAVA 'Y' flag (purity filtering) were discarded. For each sample, between
12 million and 55 million (ChIP-seq) or 23 million and 71 million (RNA-seq)
reads were obtained following quality filtering and adaptor trimming using
cutadapt v1.1 (https://cutadapt.readthedocs.io/) with parameter '-m 25 -q 25'.
Mouse genome sequence (mm9) and gene annotations were retrieved from the
iGenomes repository
(https://support.illumina.com/sequencing/sequencing_software/igenome.html).

To align the RNA-seq reads to the mouse reference genome and transcriptome,
we used Tophat v2.0.6, and the reads mapping to UCSC known genes
were counted by HTSeq. Genes with counts per million (CPM) >1 in at least two
samples were processed for further differential gene expression analysis comparing
enhancer knockout and wild-type control samples using edgeR. In each case, the
top 100 differentially expressed genes, sorted by false discovery rate (FDR), are
listed in Supplementary Tables 5-7.
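
The expression filter applied before differential analysis (CPM >1 in at least two samples) can be sketched as below. This is an illustration only: it assumes CPM is computed from raw library sizes (the column sums of the count table, each non-zero), and the function names and input layout are hypothetical rather than the edgeR calls used in the study.

```python
def counts_per_million(counts):
    """Convert raw counts to CPM using each sample's total count as library size.

    counts: {gene: [raw count per sample]}, same sample order for every gene.
    """
    n_samples = len(next(iter(counts.values())))
    library_sizes = [sum(row[i] for row in counts.values()) for i in range(n_samples)]
    return {
        gene: [row[i] * 1e6 / library_sizes[i] for i in range(n_samples)]
        for gene, row in counts.items()
    }


def expressed_genes(counts, min_cpm=1.0, min_samples=2):
    """Genes whose CPM exceeds min_cpm in at least min_samples samples."""
    cpm = counts_per_million(counts)
    return [
        gene
        for gene, values in cpm.items()
        if sum(v > min_cpm for v in values) >= min_samples
    ]
```

Requiring two samples rather than one keeps genes that are expressed in at least one replicated condition while discarding genes supported by a single library.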

For read mapping and peak calling of ChIP-seq datasets, bowtie (version
0.12.8) with parameter '-m 1 -v 2' and MACS (version 1.4.2) with parameter
'--mfold=10,30 --nomodel -p 0.0001' were used, respectively. Biological replicates
were combined using MSPC, with the following parameters: -r biological -s 1E-10
-w 1E-6 -m Highest -c 2. The predicted enhancer intervals were assigned the best
P-value (as defined by MACS) among the overlapping peaks.
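
The last step above, giving each predicted enhancer interval the best P-value among the peaks that overlap it, can be illustrated with a short sketch. It assumes half-open genomic coordinates and takes "best" to mean the smallest P-value; the tuple layout and function names are hypothetical.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def best_pvalue(interval, peaks):
    """Smallest P-value among peaks overlapping the interval, or None if there are none.

    interval: (chrom, start, end)
    peaks:    list of (chrom, start, end, pvalue)
    """
    chrom, start, end = interval
    hits = [
        p
        for peak_chrom, peak_start, peak_end, p in peaks
        if peak_chrom == chrom and overlaps(start, end, peak_start, peak_end)
    ]
    return min(hits) if hits else None
```

A linear scan is enough for a sketch; a genome-wide run would normally sort the peaks or use an interval index.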

ENCODE ChIP-seq data analysis. Raw data were downloaded from the Data
Coordination Center of the ENCODE project (https://www.encodeproject.org/,
see Supplementary Table 8 for the complete list of sample identifiers). Short reads
were aligned to the mm10 assembly of the mouse genome using bowtie, with
the following parameters: -a -m 1 -n 2 -l 32 -e 3001. Peak calling was performed
using MACS v1.4, with the following arguments: --gsize=mm --bw=300
--nomodel --shiftsize=100. Experiment-matched input DNA was used as a
control.

ENCODE RNA-seq data analysis. Raw data were downloaded from the ENCODE 
Data Coordination Center (https://www.encodeproject.org/, see Supplementary 
Table 8 for the complete list of sample identifiers). Short reads were aligned to the 
mm10 assembly of the mouse genome using Tophat v2.0.8 and Gencode vM3 
as the reference transcriptome. Cuffnorm v2.2.1 was run to quantify transcripts 
across conditions using the Gencode vM3 transcriptome as the reference and 
setting --library-norm-method to geometric. Only genes with a level of expression 
of at least one RPKM (reads per kilobase of exons per million mapped reads) in at 
least one of the considered conditions were included in further analyses. Small and 
non-coding RNAs were excluded by retaining only those genes with a Gencode 
biotype supporting protein-coding functionality. 

Classifying genes by tissue-biased patterns of expression. For each protein- 
coding gene in the mouse genome, the expression variability across the twenty-nine 
ENCODE RNA-seq experiments from multiple tissues and developmental time 
points was evaluated using two metrics: a measure of tissue-specificity (τ) 
ranging from 0 (consistent expression across all conditions) to 1 (expression in 
one single condition); and a measure of relative expression in a condition of interest 
(for example, limb at E11.5). Given a gene, the latter was defined as the difference 
between the percentile of expression of the gene in the given condition and the 
median percentile of expression across all the samples. A large positive number 
indicates a gene that is much more expressed in the condition of interest than the 
average. 

Tissue-biased genes were defined as showing τ > 0.7 and relative expression 
higher than the 95th percentile. Housekeeping genes were defined as having τ < 0.4 
and relative expression between the 5th and 95th percentiles. The complete lists of 
genes assigned to each category are available in Supplementary Table 9. 
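
The two metrics and the thresholds above can be illustrated with a short sketch. Several points are assumptions rather than details given in the text: τ is computed with the commonly used formula τ = Σ(1 − x_i / x_max) / (n − 1), percentiles are taken as the share of genes at or below a value within each condition, and the 5th/95th percentile cut-offs on relative expression are passed in as precomputed numbers. All function names are hypothetical.

```python
from statistics import median


def tau(expr):
    """Tissue-specificity index: 0 = uniform expression, 1 = a single condition."""
    top = max(expr)
    if top <= 0:
        return 0.0
    return sum(1 - x / top for x in expr) / (len(expr) - 1)


def percentile_ranks(values):
    """Percentile (0-100) of each value: share of values less than or equal to it."""
    n = len(values)
    return [100.0 * sum(v <= x for v in values) / n for x in values]


def relative_expression(matrix, condition):
    """matrix: {gene: [expression per condition]}.

    Returns, per gene, its percentile in `condition` minus its median
    percentile across all conditions.
    """
    genes = list(matrix)
    n_cond = len(matrix[genes[0]])
    pct = {g: [] for g in genes}
    for j in range(n_cond):  # rank genes within each condition
        column = percentile_ranks([matrix[g][j] for g in genes])
        for g, p in zip(genes, column):
            pct[g].append(p)
    return {g: pct[g][condition] - median(pct[g]) for g in genes}


def classify(tau_value, rel, rel_p5, rel_p95):
    """Apply the thresholds from the text; rel_p5/rel_p95 are assumed precomputed."""
    if tau_value > 0.7 and rel > rel_p95:
        return "tissue-biased"
    if tau_value < 0.4 and rel_p5 <= rel <= rel_p95:
        return "housekeeping"
    return "other"


# Toy matrix (hypothetical): three genes, three conditions; condition 2 is of interest.
toy = {"g1": [1, 1, 9], "g2": [5, 5, 5], "g3": [9, 9, 1]}
rel = relative_expression(toy, condition=2)
```

In this toy example g1 has a positive relative expression in the condition of interest, g2 is flat, and g3 is negative.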

Gene classification based on pre-specified functional categories. Tissue-biased 
developmental transcription factors (sometimes referred to as tissue-specific tran- 
scription factors) were defined as genes with biased expression in a given tissue (see 
previous section), associated with abnormal developmental phenotypes in the same 
tissue (terms extracted from the Mouse Genome Informatics (MGI) database, 
listed in Supplementary Table 10) and annotated as a transcription factor under 
the terms GO:0003700 or GO:0003705 in the Gene Ontology (GO). Annotations 
were downloaded from GO and MGI on July 7, 2016. 

Topologically associated domains. TAD coordinates estimated from mouse 
embryonic stem cell Hi-C data were downloaded from 
http://chromosome.sdsc.edu/mouse/hi-c/download.html. Coordinates were converted 
from mm9 to mm10 using liftOver. 

A statistical framework defining enhancer-promoter associations genome- 
wide. A list of putative enhancer regions was first defined as follows: after excluding 
any region annotated to the mitochondrial or any random chromosome, the BED 
coordinates of the H3K27ac peaks across the twenty-nine conditions (different 
combinations of tissue and developmental stage as defined by the ENCODE 
consortium, see ‘ENCODE ChIP-seq data analysis’ above) were merged using 
the mergeBed utility from BEDTools v.2.17.0. For a more robust signal 
estimation (see below), regions shorter than 500 bp were enlarged to 1 kb from their 
central coordinate. Promoters, defined as regions within 2.5 kb of the 
transcriptional start sites of genes annotated in Gencode vM3, were then excluded using 
subtractBed from BEDTools v.2.17.0. After that, any remaining region shorter 
than 1 kb was excluded. Uniquely aligned, de-duplicated reads were then used to 
quantify the H3K27ac signals at each region, for each one of the 29 conditions. 
These signals were measured using the coverageBed utility from BEDTools 
v.2.17.0, normalized to RPKM (according to the sequencing depth of each specific 
sample), and log2-transformed. The resulting list of 74,366 predicted enhancers 
and their corresponding H3K27ac signal quantifications, along with the mRNA 
expression measurements for the protein-coding genes (as defined in ‘Classifying 
genes by tissue-biased patterns of expression’), were used as input for the statistical 
framework described below. The main steps of the approach are also outlined in 
Extended Data Fig. 9b. 
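
Two of the steps above, enlarging short regions and converting read counts to log2 RPKM, can be sketched as below. This is an illustration under stated assumptions: "enlarged to 1 kb from their central coordinate" is read as a 1-kb window centred on the region midpoint, and a pseudocount of 1 is added before the log transform because the text does not say how zero signal was handled. The function names are hypothetical and are not BEDTools calls.

```python
import math


def enlarge_short(start, end, min_len=500, target=1000):
    """Resize regions shorter than min_len to `target` bp around their centre."""
    if end - start >= min_len:
        return start, end
    centre = (start + end) // 2
    half = target // 2
    return max(0, centre - half), centre + half  # clamp at the chromosome start


def rpkm(read_count, region_len_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    return read_count * 1e9 / (region_len_bp * total_mapped_reads)


def log2_rpkm(read_count, region_len_bp, total_mapped_reads, pseudocount=1.0):
    """log2-transformed RPKM; the pseudocount is an assumption, not from the text."""
    return math.log2(rpkm(read_count, region_len_bp, total_mapped_reads) + pseudocount)


# A 300-bp peak is widened to 1 kb; a 1-kb peak is left unchanged.
widened = enlarge_short(1000, 1300)
unchanged = enlarge_short(1000, 2000)
# 100 reads over 1 kb in a sample with 10 million mapped reads.
signal = rpkm(100, 1000, 10_000_000)
```

Normalizing by both region length and sequencing depth is what makes the H3K27ac signal comparable across regions and across the 29 conditions.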

For each previously defined TAD in the mouse genome, we retrieved all of 
the enhancers predicted and the genes expressed in at least one of the twenty-nine 
conditions considered that fell within that TAD. Pairwise correlations between 
all possible enhancer-gene combinations within the TAD were then evaluated by 
calculating Spearman's rank correlation coefficient (SCC) between the H3K27ac 
pattern of enrichment at the enhancer and the mRNA expression of the gene 
across the conditions. Each putative enhancer was initially assigned to the gene 
showing the highest SCC value (in the very rare case of ties, all of the genes 
showing the same SCC value were assigned to the enhancer). After that, a null 
distribution of SCC values was estimated empirically, by pairing the enhancer 
with 1,000 randomly picked genes from the same chromosome. The z-score 
for the correlation coefficient was then calculated by subtracting the mean and 
dividing by the standard deviation estimated from the empirical null. The corre- 
sponding P value was calculated using the pnorm function in R. Finally, only those 
putative enhancers showing a P value < 0.05 and a SCC ≥ 0.25 were retained, 
resulting in a set of 34,882 enhancers with an assigned target (Supplementary 
Table 11). Considering the entire, genome-wide set of pairwise associations, a 
P=0.05 corresponds to a Benjamini-Hochberg corrected FDR of 0.087. This 
analysis resulted in the assignment of one or more putative enhancers to 9,365 
protein-coding genes (Supplementary Table 12). To define a set of genes with 
many redundant enhancers, we considered enhancers as redundant only if they 
were associated with the same gene by correlation and showed a strong peak of 
H3K27ac in the same exact tissue under examination (for example, both enhancers 
are active in limb and linked to the Gli3 gene). Although this correlative approach 
may result in a subset of false-positive assignments for individual genes, it enables 
an approximation of both regulatory complexity and potential enhancer redun- 
dancy across the entire genome. We found 1,276 genes that showed multiple 
assigned enhancers such that at least five of the enhancers were all active in the 
same tissue (limb, heart or brain). We then used a permutation scheme to directly 
evaluate the statistical robustness of this conclusion (that is, 1,276 genes with 5 or 
more redundant enhancers in either developing limbs, heart or forebrain), which 
considered increasingly higher correlation values between the activity of putative 
enhancers and expression of genes (Extended Data Fig. 10c-f). By re-shuffling 
the expression values of each gene across conditions (100 genome-wide permu- 
tations), we estimated the FDR of observing a gene with five or more enhancers 
attached to it, for increasingly larger correlation coefficients. Each permutation 
consisted of the same enhancers and genes, in which the H3K27ac values were 
left as in the actual data whereas the RNA expression values of the genes across 
the different samples were randomly reshuffled. For each genome-wide permuted 
matrix, the entire statistical approach described above was re-run and a map of 
enhancer-promoter associations was generated. For each value of Spearman's 
correlation coefficient (0.25 to 0.75, with a 0.01 step) the number of genes showing five 
or more enhancers in the permuted data was calculated. The average across the 100 
iterations was then computed and used for FDR estimation. This was calculated as 
the average number of genes showing five or more enhancers across the permuted 
data, over the number of genes derived from the actual data. 
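
The core of the framework (best-correlated gene within the TAD, empirical null from random same-chromosome genes, z-score, normal P value, and the permutation-based FDR ratio) can be sketched as follows. This is a simplified illustration, not the authors' code: the upper-tail P value, sampling null genes with replacement, and keeping only the first gene on ties are assumptions, and all function names and toy profiles are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def spearman(x, y):
    """Spearman's rank correlation coefficient (SCC)."""
    return pearson(ranks(x), ranks(y))


def empirical_p(observed_scc, null_sccs):
    """z-score against the empirical null, then an upper-tail normal P value."""
    z = (observed_scc - mean(null_sccs)) / stdev(null_sccs)
    return z, 1.0 - NormalDist().cdf(z)  # analogous to 1 - pnorm(z) in R


def assign_enhancer(h3k27ac, tad_genes, chrom_genes, n_null=1000, seed=0):
    """Assign an enhancer to the TAD gene with the highest SCC and test it."""
    rng = random.Random(seed)
    sccs = {g: spearman(h3k27ac, expr) for g, expr in tad_genes.items()}
    best = max(sccs, key=sccs.get)  # ties: only the first gene is kept here
    names = list(chrom_genes)
    null = [spearman(h3k27ac, chrom_genes[rng.choice(names)]) for _ in range(n_null)]
    z, p = empirical_p(sccs[best], null)
    keep = p < 0.05 and sccs[best] >= 0.25
    return best, sccs[best], z, p, keep


def permutation_fdr(mean_permuted_count, observed_count):
    """Average number of genes with >= 5 enhancers in permuted data over the real count."""
    return mean_permuted_count / observed_count


# Toy profiles across five conditions (hypothetical values).
tad = {"Tbx3": [1, 2, 3, 4, 5], "Tbx5": [5, 4, 3, 2, 1]}
chrom = {"g1": [3, 1, 4, 1, 5], "g2": [2, 7, 1, 8, 2], "g3": [9, 2, 6, 5, 3]}
best, scc, z, p, keep = assign_enhancer([10, 20, 30, 40, 50], tad, chrom, n_null=50)
```

For the permutation step, the whole procedure would be re-run on matrices in which each gene's expression values are shuffled across conditions, and `permutation_fdr` applied at each SCC threshold.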

Statistical analysis. Statistical analyses are described in detail in the Methods 
sections above. Whenever a P value is reported in the text, the statistical test is also 
indicated. Unless specified otherwise, all the statistics were estimated and plots 
drawn using the statistical computing environment R (https://www.r-project.org) 
or GraphPad Prism 7 software. 

Data availability. ChIP-seq and RNA-seq datasets are available in the NCBI 
GEO database with the accession code GSE93730. Additional data supporting the 
findings of this study are available from the corresponding authors upon reasonable 
request. 
Extended Data Figure 1 | CRISPR deletion of ten limb enhancers 
and regulatory interaction landscape of associated target genes. 
a-j, Left, representative activity patterns of the selected enhancers in 
mouse embryos at E11.5 (VISTA enhancer browser) and the respective 
genomic enhancer regions tested in transgenic assays (Tg, blue bar), 
along with the regions deleted in enhancer knockout mice (Del, red 
bar). Corresponding H3K27 acetylation patterns (green) in wild-type 
mouse embryonic forelimbs at E11.5 (this study) are depicted with open 
chromatin (ENCODE DHS in forelimbs at E11.5, purple) and the Placental 
Mammal basewise conservation track by PhyloP (Cons, blue/red). Scale 
bars, 500 bp. VISTA enhancer IDs (mm and hs numbers) are indicated on 
the left, with the distance of the enhancer from the transcriptional start 
site of the predicted target gene in the mouse genome. Numbers at the 
bottom right of each embryo indicate the reproducibility of the enhancer 
reporter assay. Arrowheads mark additional activity domains (other than 
limb): hs1262 (hindbrain, reproducibility: 5/6, also shown previously), 
mm917 (dorsal root ganglion, 7/7) and hs1603 (nose, 7/7; and branchial 
arch, 5/7). Asterisk indicates potential craniofacial enhancer activity for 
mm636, which was observed in 3 of 9 embryos. Right, PCR validation 
strategy and results for enhancer knockout lines. Red scissors indicate 
CRISPR-mediated deletion breakpoints. PCR was used to detect the 
wild-type (+) and enhancer deletion (Δ) alleles. Below, Sanger sequencing 
traces show the deletion breakpoints (indicated by the dashed line) for 
the enhancer knockout alleles. PCR genotyping results are shown with 
amplicon sizes indicated on the left (enhancer deletion allele in red). 
Primers (Ctrl or Ctrl2) amplifying an unrelated genomic region were 
included as a PCR positive control. See Supplementary Table 3 for all primer 
sequences and related PCR product sizes. k, Top, Hi-C interaction heat 
maps of topologically associated chromatin domains (mouse embryonic 
stem cell TADs). Bottom, selected enhancers (blue triangles) and their 
predicted target genes (TSS indicated as black bar). The Capture-C UCSC 
browser track (purple) illustrates three-dimensional chromatin interaction 
profiles from E11.5 embryonic limbs (3-kb window) using promoters 
of the predicted enhancer target genes as anchor points. H3K27ac 
enrichment (green) in wild-type forelimbs at E11.5 (this study) is shown 
below. Six of the ten enhancers selected for deletion analysis display 
local Capture-C enrichment (*), indicating physical interaction with the 
predicted target gene promoter at E10.5 or E11.5, based on the stringent 
statistical approach (95th percentile threshold) applied in the original 
study. Other genes present in the TAD are shown in grey. 
Extended Data Figure 2 | No major differences in expression of 
predicted target genes in individual enhancer knockouts. a, Spatial 
enhancer activity domains (LacZ, see also Fig. 1b) are compared to mRNA 
expression domains (by in situ hybridization) of the predicted target genes 
in embryonic forelimbs and hindlimbs at E11.5. No significant changes 
in expression patterns were observed in enhancer knockouts compared 
to wild-type limbs, except in limbs lacking hs741, where a small subdomain 
of target gene expression was lost (red arrowhead marks loss of the 
posterior Shox2 domain in the distal limb, compared with black arrowhead 
in wild type). Transcript distribution was reproduced in at least n = 3 
independent biological replicates. b, Quantitative real-time PCR using 
limbs of homozygous null (KO, red dots) and wild-type (Wt, blue dots) 
embryos at E11.5 reveals lack of significantly downregulated transcript 
levels of predicted enhancer target genes in nine out of ten cases. Box 
plots indicate median, interquartile values, range and individual biological 
replicates. Outliers are shown as circled data points. **P = 0.0012, 
unpaired, two-tailed t-test. n.s., not significant. Scale bars, 100 μm. 
Extended Data Figure 3 | Absence of obvious morphological 
abnormalities in limb enhancer knockouts. Side-by-side comparison 
of enhancer knockout limb skeletons and wild-type littermate controls 
at E18.5. Neither forelimbs (this figure) nor hindlimbs (data not shown) 
of the enhancer knockout lines revealed any obvious morphological 
differences in comparison to wild-type littermates. Cartilage is stained 
blue and bone dark red. The number of embryos with normal limb 
phenotypes over the total number of homozygous-null embryos examined 
is shown in the bottom left. n represents number of independent biological 
replicates with similar results. Scale bar, 1 mm. 
Extended Data Figure 4 | Absence of compensatory enhancer signatures 
in limbs of enhancer knockout embryos. a, Layered ChIP-seq H3K27 
acetylation (ac) profiles surrounding the deleted enhancers and from 
wild-type (blue, n= 4 independent biological replicates) and enhancer 
knockout embryos (orange, at least n = 2 biological replicates). For all 
samples, E11.5 forelimb was profiled. For display, replicates were merged 
using bigWigMerge (UCSC tools) and normalized. Red triangles indicate 
the positions of individual enhancer deletions. b, H3K27ac enrichments 
in targeted regions marked by red triangles in a, showing the absence of 
H3K27ac at the deletion site in individual enhancer knockout (orange) 
compared to wild-type (blue) samples. Blue bars indicate locations of 
enhancer sequences. Dashed red lines demarcate the regions deleted by 
CRISPR. Vertebrate basewise conservation track by PhyloP (Cons) is 
shown. 
Extended Data Figure 5 | Transcriptional and phenotypic impact of 
dual enhancer deletions engineered by iterative CRISPR-Cas9 genome 
editing. a-c, Top, enhancer pairs with overlapping limb activities (LacZ), 
coinciding with domains of predicted target gene expression visualized by 
in situ hybridization (ISH). For Sox9 enhancers, black arrowheads indicate 
overlapping domains. Schematics, double enhancer deletion strategy to 
delete the three enhancer pairs with overlapping activity (see Methods). 
Grey numbers indicate enhancer distance (kb) from the TSS. Bottom, 
Sanger sequencing verification of the secondary enhancer deletion. 
Deletion breakpoint is marked by the dashed line. Grey horizontal bars 
indicate bases present in the primary deletions (single enhancer knockout 
lines, see Extended Data Fig. 1a-j). Shox2- and Sox9-associated LacZ 
panels are also used in Extended Data Fig. 2. d, Gli3 transcript distribution 
in situ hybridization in wild-type (Wt) and mm1179/hs1586 DKO 
embryos. Arrowhead points to reduced Gli3 transcript in the anterior limb 
mesenchyme. Dashed line indicates dissected hand plate for RNA-seq. 
e, RNA-seq confirmed significantly reduced Gli3 expression in hand 
plates of DKO embryos but not individual enhancer knockout embryos 
(compared to wild-type hand plates). f, Unaffected hindlimb morphology 
in mm1179/hs1586 DKO embryos. Red arrowhead points to digit 1 
duplication in forelimbs (see also Fig. 2). g, Shox2 expression (in situ 
hybridization) in forelimbs and hindlimbs of hs741/hs1262 DKO 
embryos. The distal-posterior domain (arrowhead) is dependent on hs741 
(Extended Data Fig. 2a). h, Reduced Shox2 expression in forelimbs and 
hindlimbs of hs741/hs1262 DKO embryos (qPCR). Expression of the 
nearby Rsrc1 gene was unchanged. i, Left, representative limb skeletons 
of wild-type and hs741/hs1262 DKO embryos. Hu, humerus; Ul, ulna; 
Fe, femur; Ti, tibia. Right, mild but significant reduction in humerus 
ossification length (double arrows) in hs741/hs1262 DKO limb skeletons. 
***P = 1.66 × 10⁻⁷ (two-tailed, unpaired t-test). j, Absence of evident 
differences in Sox9 expression or skeletal abnormalities in embryos 
lacking both the hs1467 and mm636 enhancers near Sox9. For in situ 
hybridization, transcript distribution was reproduced in at least n= 3 
independent biological replicates. n represents number of independent 
biological replicates with similar results. For bar graphs and boxplots, 
individual biological replicates are shown as data points. Bar graphs 
illustrate mean and s.d. Box plot indicates median, interquartile values 
and range. ***P < 0.001; **P < 0.01 (two-tailed, unpaired t-test). n.s., not 
significant. Scale bars, 100 μm (white) and 500 μm (black). 
Extended Data Figure 6 | Cellular resolution of redundant Gli3 
enhancer activities at the onset of digit formation. a, b, Individual Gli3 
enhancer activities as detected by immunofluorescence (mm1179, green; 
hs1586, red) in forelimbs of transgenic reporter embryos. Sox9 (grey) 
marks chondrogenic progenitors of the mesenchymal condensations 
forming digit primordia (digits 1-5, from anterior to posterior). 
c, d, Co-localization of mm1179 and hs1586 enhancer activities in hand 
plates of double enhancer transgenic embryos. Close-ups (right) show that 
the anterior mesenchyme (Fig. 2c) harbours many cells with dual enhancer 
activities (yellow). A fraction of double enhancer-positive cells carries the 
signature of Sox9 digit progenitors (white, bottom). n = 3 independent 
embryos per genotype were analysed, with similar results. Nuclei, detected 
via Hoechst staining, are blue. Scale bars, 100 μm (a, b); 50 μm (c, d). 
Extended Data Figure 7 | Generation of Gli3 and Shox2 knockout alleles 
and characterization of enhancer deletions in a sensitized background. 
a, d, Top, schematic showing CRISPR-Cas9-mediated deletions used to 
generate Gli3 and Shox2 loss-of-function alleles. Genotyping primers 
used to validate targeted deletion events are indicated. Bottom, Sanger 
sequencing confirmation of deletion event, with grey and red dashed lines 
indicating breakpoints. Right, PCR genotyping examples with the size of 
the product specific for the deletion allele depicted in red (primers listed 
in Supplementary Table 3). b, In situ hybridization showing the gradual 
decrease in anterior Gli3 transcript in forelimbs of wild-type, Gli3Δ/+ 
and sensitized mm1179/hs1586 DKO (DKO/Gli3Δ) embryos. c, qPCR 
validation of Gli3 mRNA levels in forelimb hand plates from the genotypes 
shown in b. e, Shox2 expression (in situ hybridization) in forelimbs and 
hindlimbs of wild-type, Shox2Δ/+ and sensitized hs741/hs1262 DKO 
(DKO/Shox2Δ) embryos. Arrowheads point to the domains where Shox2 
expression is nearly abolished in enhancer DKO/Shox2Δ embryos. f, qPCR 
revealing significantly downregulated Shox2 mRNA levels in hindlimbs of 
DKO/Shox2Δ compared to Shox2Δ/+ embryos. n indicates the number of 
independent biological replicates with similar results. Bar plots illustrate 
mean and s.d., with individual biological replicates shown. ***P < 0.001; 
*P < 0.05 (two-tailed, unpaired t-test). n.s., not significant. For in situ 
hybridization, transcript distribution was reproduced in at least n = 3 
independent biological replicates. Scale bars, 100 μm. 
Extended Data Figure 8 | Limb phenotypes of individual and 
combinatorial Gli3 and Shox2 enhancer knockouts in the presence 
of reduced target gene dosage. a, Skeletal phenotypes resulting from 
mm1179 and hs1586 enhancer deletions in combination with reduction to 
one copy of the Gli3 gene at E18.5. Genotypes are shown on the left with 
red crosses indicating elements deleted by CRISPR-Cas9. While forelimbs 
of Gli3Δ/+ embryos displayed bifurcated digit 1 terminal phalanges, 
hindlimbs showed an extra toe structure but without detectable cartilage 
template. Four out of seven mm1179Δ/Gli3Δ embryos displayed additional 
bifurcation of digit 2 of the right forelimb (a), which suggests that removal 
of mm1179 reduces Gli3 levels in the anterior forelimb more than deletion 
of hs1586. An almost complete anterior extra toe formed in hindlimbs 
of embryos with single or dual enhancer deletions in the sensitized 
background (black asterisks). Loss of both Gli3 copies resulted in anterior 
hindlimb polydactyly with altered digit identities (red asterisks). 
b, Allelic series depicting shortening of the stylopod (humerus and femur) 
in limb skeletons with individual or combined hs741 and hs1262 enhancer 
deletions in a Shox2 sensitized condition (see also Fig. 3b). Stylopod 
ossification length (double arrows) appears less reduced in forelimbs 
(humerus, Hu) than in hindlimbs (femur, Fe) of embryos lacking the 
activity of both enhancers (hs741Δ, hs1262Δ/Shox2Δ). Tibia (Ti) and ulna 
(Ul) were normal in all genotypes examined. c, Humerus ossification 
length (normalized to ulna ossification length) is significantly reduced in 
embryos lacking either hs741 or hs1262 in the presence of only one copy 
of Shox2. In embryos lacking both enhancers in the sensitized background, 
significant shortening of humerus ossification is observed (compared to 
all other genotypes). n indicates the number of independent biological 
replicates with similar results. Box plots indicate median, interquartile 
values, range and individual biological replicates. ***P < 0.001; *P < 0.05 
(two-tailed, unpaired t-test). Scale bars, 500 μm. 


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


Extended Data Figure 9 (image). Panel labels: Assigning individual enhancers to 
genes; Putative enhancers (all tissues); Predicted enhancers; H3K27ac-mRNA 
correlation across conditions; null estimated by pairing to 1,000 randomly 
picked genes; P value ≤ 0.05, Spearman's correlation ≥ 0.25. Axes: Tissue 
specificity (Tau); # enhancers / gene. 
Extended Data Figure 9 | A correlative framework to define enhancer- 
promoter associations across the mouse genome. a, The TAD including 
the transcriptional regulators Tbx3, Tbx5 and Lhx5 illustrates the 
statistical framework to define enhancer-promoter associations genome- 
wide. For each predicted enhancer, correlation between its H3K27ac 
signal (blue arrowhead, blue-shaded heat map) with the mRNA expression 
profiles of every gene in the TAD (red-shaded heat map) across all 
available tissues and developmental stages was assessed. The enhancer 
was then assigned to the most highly correlated gene, Tbx3 in the case of 
enhancer 3. b, Schematic depicting the underlying statistical framework 
used to determine genome-wide enhancer-promoter interactions (see 
Methods). c, Activity pattern for the enhancers assigned to Tbx3, Tbx5 
and Lhx5. Genomic coordinates are listed on the right. For each predicted 
enhancer-gene pair, Spearman's correlation coefficient (SCC, n = 29) and 
the corresponding empirically estimated P value (from 1,000 random 
enhancer-gene pairings) are shown in Supplementary Table 11. 


d, Identifying genes with biased expression in embryonic limb, forebrain, 
or heart. Expression variability across 29 RNA-seq datasets from multiple 
tissues and developmental time points, measures of tissue specificity 
(Tau (τ), x-axis) and specific tissue-biased expression at E11.5 (y-axis) for 
each protein-coding gene were calculated (see Methods). Housekeeping 
genes were defined as displaying τ < 0.4 and relative expression in the 
limb between the 5th and 95th percentiles. Tissue-biased genes were 
defined as showing τ > 0.7 and relative expression higher than the 95th 
percentile. e, Distribution of enhancer numbers assigned to each gene, 
for the different gene categories. Genes with tissue-biased expression 
profiles were associated with a significantly higher number of enhancers 
than housekeeping genes. P=4 x 107 '?! (n=553), P=7 x 10°” (n=626) 
and P=6 x 10 * (n=826) for limb, forebrain and heart biased genes, 
respectively (two-sided Mann-Whitney tests). n = 1,287 for housekeeping 
genes. Box plots indicate median, interquartile values and range. Outliers 
are shown as individual points. 
Extended Data Figure 10 | Enhancer redundancy as a widespread 
feature of developmental genes and robustness to the choice of 
thresholds used in the correlative approach. a, b, Top, number of 
enhancers assigned to each gene through the correlative framework, with 
developmental transcription factors (TFs) showing biased expression in 
forebrain (a, blue dots) or heart (b, orange dots) indicated. Classification 
of tissue-biased developmental transcription factors is described in 
Methods. Genes with at least one assigned enhancer are displayed and 
sorted according to the number of assigned enhancers (left to right). 
Bottom, bar plot showing the total number of enhancers assigned to 
each of the transcription factors highlighted in the top panels. For each 
gene, a colour code shows the number of predicted enhancers assigned 
to that gene in the relevant tissue (a, heart; b, forebrain) at E11.5 (dark 
colour), in the relevant tissue at any other developmental stage included 
in the analysis (light colour), or in any other tissue (white). c, Estimated 
FDR (based on genome-wide permutations, see Methods) of observing 
a gene with five or more enhancers assigned to it, for increasingly larger 
correlation coefficients (0.25 to 0.75). The red solid line indicates an 
FDR of 0.05. The red arrow and the black dashed line highlight the 
lowest correlation coefficient (0.47, considering a step of 0.01) with an 
FDR < 0.05 (FDR = 0.0495). d, Number of genes showing five or more 
enhancers assigned to them, for increasingly larger correlation coefficients 
(0.25 to 0.75). The total number of genes (SCC > 0.25) along with the 
number of genes identified using the threshold set in c (SCC ≥ 0.47) is 
indicated (1,276 and 1,058, respectively; see Supplementary Tables 11, 12). 
e, Bubble plot showing the number of genes with five or more enhancers 
assigned to them, at increasingly higher correlation between enhancer 
and target gene expression (x-axis) and between enhancers assigned to the 
same gene (y-axis). f, Bubble plot displaying the fold-enrichment (linear) 
for developmental transcription factor genes among each set in c. 
Dopamine neuron activity before action initiation 
gates and invigorates future movements 


Joaquim Alves da Silva, Fatuel Tecuapetla, Vitor Paixão & Rui M. Costa 


Deciding when and whether to move is critical for survival. Loss of 
dopamine neurons (DANs) of the substantia nigra pars compacta 
(SNc) in patients with Parkinson’s disease causes deficits in 
movement initiation and slowness of movement. The role of DANs 
in self-paced movement has mostly been attributed to their tonic 
activity, whereas phasic changes in DAN activity have been linked 
to reward prediction. This model has recently been challenged by 
studies showing transient changes in DAN activity before or during 
self-paced movement initiation. Nevertheless, the necessity of 
this activity for spontaneous movement initiation has not been 
demonstrated, nor has its relation to initiation versus ongoing 
movement been described. Here we show that a large proportion of 
SNc DANs, which did not overlap with reward-responsive DANs, 
transiently increased their activity before self-paced movement 
initiation in mice. This activity was not action-specific, and was 
related to the vigour of future movements. Inhibition of DANs when 
mice were immobile reduced the probability and vigour of future 
movements. Conversely, brief activation of DANs when mice were 
immobile increased the probability and vigour of future movements. 
Manipulations of dopamine activity after movement initiation did 
not affect ongoing movements. Similar findings were observed for 
the initiation and execution of learned action sequences. These 
findings causally implicate DAN activity before movement initiation 
in the probability and vigour of future movements. 

We quantified spontaneous motion of mice placed in an open field 
(with no external cues, food deprivation or reward) by using inertial 
sensors that measure high-resolution acceleration and angular velocity 
(Fig. 1a). We found that the distribution of movements in an open field 
was not a continuum between arrest and motion, but rather a bimodal 
distribution (Fig. 1b and Extended Data Fig. 1). Using the minimum 
acceleration between the two peaks of the bimodal distribution 
(Fig. 1b), we could separate periods of immobility from periods of 
overt mobility (Fig. 1b and Extended Data Fig. 1), and precisely identify 
moments of spontaneous movement initiation. 
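
The thresholding step described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the bin count, the peak-separation rule (min_sep) and the requirement of 300 ms of prior immobility before an initiation is counted are assumptions made here for illustration, and all function names are hypothetical.

```python
def histogram(values, n_bins):
    """Count values into n_bins equal-width bins; return (counts, bin centres)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    centres = [lo + (i + 0.5) * width for i in range(n_bins)]
    return counts, centres


def bimodal_threshold(values, n_bins=20, min_sep=3):
    """Centre of the emptiest bin between the two tallest, well-separated bins."""
    counts, centres = histogram(values, n_bins)
    first = max(range(n_bins), key=counts.__getitem__)
    # Assumption: the second mode lies at least min_sep bins from the first.
    far = [i for i in range(n_bins) if abs(i - first) >= min_sep]
    second = max(far, key=counts.__getitem__)
    left, right = sorted((first, second))
    valley = min(range(left, right + 1), key=counts.__getitem__)
    return centres[valley]


def movement_initiations(accel, threshold, dt, min_immobile_s=0.3):
    """Sample indices where acceleration reaches threshold after an immobile run."""
    need = round(min_immobile_s / dt)  # samples of immobility required (assumed)
    onsets, below = [], 0
    for i, a in enumerate(accel):
        if a < threshold:
            below += 1
        else:
            if below >= need:
                onsets.append(i)
            below = 0
    return onsets
```

Taking the threshold at the histogram minimum between the two modes means it is derived from the data of each recording rather than fixed in advance.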

We investigated the activity of photoidentified SNc DANs in relation to
movement initiation by implanting 16-channel movable electrode bundles
coupled to a fibre-optic cannula placed just above the SNc (Fig. 1a).
We implanted these bundles in six TH-Cre mice⁸ (Extended Data
Fig. 2a–c) crossed with Ai32 mice⁹ to obtain expression of
channelrhodopsin-2 (ChR2) in TH⁺ neurons, and used photoidentification¹⁰
to identify putative DANs (Extended Data Fig. 3, see Methods for
details). Next, we built peri-event time histograms (PETH) of their
activity aligned to spontaneous movement initiations. We found that
the average activity of all recorded DANs increased transiently before
movement initiation (Fig. 1d). Consistently, we found that many
photoidentified DANs were significantly modulated by movement
initiation and that the majority of these were positively modulated
(Fig. 1d). The latency for modulation preceded the initiation of
movement for positively modulated neurons, but not for negatively
modulated neurons (Fig. 1e). These findings were corroborated using
microendoscopic calcium imaging of SNc DANs in freely moving mice
(Extended Data Fig. 4; 22 neurons, see Methods and below for details).
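
A peri-event time histogram of the kind used here can be sketched as follows. The window, the bin width and the function name are illustrative assumptions, not values taken from the paper.

```python
def peth(spike_times, event_times, window=(-1.0, 1.0), bin_s=0.5):
    """Mean firing rate (spikes per second) in bins aligned to each event.

    spike_times and event_times are in seconds; window is (start, stop)
    relative to the event. All defaults are assumed for illustration.
    """
    start, stop = window
    n_bins = round((stop - start) / bin_s)
    counts = [0] * n_bins
    for ev in event_times:
        for t in spike_times:
            rel = t - ev
            if start <= rel < stop:
                counts[min(int((rel - start) / bin_s), n_bins - 1)] += 1
    # Normalize by the number of events and the bin width to obtain a rate.
    return [c / (len(event_times) * bin_s) for c in counts]
```

A rise in the bins immediately before time zero, relative to earlier bins, would correspond to the pre-initiation increase described above.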

We next examined whether the transient activity of individual 
DANs before action initiation was tuned to the initiation of specific 
actions, or represented a more general signal before action initiation. 
To characterize each spontaneous movement initiation, we built
trajectories in the motion-sensor space using a combination of total body
acceleration (correlated with displacement, see Extended Data Fig. 1),
angular velocity (direction of movement) and gravitational acceleration
(postural changes)¹¹ (Fig. 1f). Next, we used affinity propagation¹² to
cluster similar movement initiations¹³ (Fig. 1g). We found that most
DANs were broadly tuned, and were transiently active before rather 
different initiations (Fig. 1h). Furthermore, we found that initiation 
trajectories preceded by increased activity of each DAN were as variable 
as all other initiation trajectories (Fig. 1i). These data strongly indicate 
that the transient activity of individual DANs before initiation is not 
very action-specific. 
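
The variability comparison can be made concrete with a small sketch of a spread measure, taken here as the mean distance from each initiation to every other initiation. Representing each initiation trajectory as a fixed-length feature vector and using Euclidean distance are assumptions of this sketch; the function name is hypothetical.

```python
import math


def spread(points):
    """Mean Euclidean distance from each point to every other point."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two initiations")
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += math.dist(points[i], points[j])
    return total / (n * (n - 1))
```

If a neuron were tuned to one type of initiation, the spread of the initiations it precedes would be expected to be smaller than the spread of the remaining initiations.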

Previous studies in patients with Parkinson's disease¹⁴ and animal
models¹⁵ have shown that dopamine depletion leads to less vigorous
movements. We therefore investigated whether the activity of DANs
before movement initiation encoded information about the vigour of
the movement that was about to be initiated. We verified that overall,
the activity of DANs 300 ms before movement initiation was significantly
related to the vigour of future movements (measured by body
acceleration; Fig. 1j, k). When doing per-trial analyses comparing all
the lower vigour with the higher vigour initiations, we found that 38.5%
of the neurons had significantly higher activity before higher vigour
movements (Fig. 1l).

In order to test whether inhibiting DAN activity affects movement
initiation, we expressed archaerhodopsin (ArchT)¹⁶ specifically in
SNc neurons (AAV2/1.CAG.Flex.ArchT-GFP injected into TH-Cre
mice, Extended Data Fig. 5). We expressed ArchT in the SNc of 11
TH-Cre mice and YFP in the SNc of 9 TH-Cre mice (control group),
and delivered light unpredictably for periods of 15 s (Fig. 2a). Inhibition
of SNc DANs increased the probability of mice being immobile
(Fig. 2b, c). This was not observed in YFP controls (Fig. 2b, c). To
investigate whether DAN inhibition mainly affected movement initiation or
reduced ongoing movement, we investigated the effects of DAN
inhibition in trials in which the mouse was immobile (for at least 300 ms),
or mobile (for at least 300 ms), when the inhibition started (Fig. 2d–h).

We found a significant impairment in movement initiation when
SNc DANs were inhibited during immobility (Fig. 2g). The effect of
inhibition was relatively rapid, with a significant difference between
light and no light after 2.4 s (Fig. 2e). Consistently, we also found that
5 s of inhibition was sufficient to impair movement initiation (Extended
Data Fig. 6).

By contrast, there was no significant change in mean acceleration
when inhibition happened after movement onset (Fig. 2g). There was
no change in the vigour of movement when SNc DANs were inhibited
after movement initiation and mice did not stop during the inhibition
(Fig. 2h). Furthermore, in mobile trials in which animals stopped during
the 15 s (>300 ms immobile), there was no difference in acceleration
between laser-on and laser-off trials before the first stop. However,
there was clear impairment in movement initiation after the first stop
(Fig. 2i). This was not observed in YFP controls (Fig. 2j).

Figure 1 | Movement initiation is preceded by increased activity of
dopamine neurons. a, Top left, photoidentification schematics. Top
right, movable bundle electrode array with fibre-optic cannula. Bottom,
open field set-up. The mouse brain schematic in this panel has been
redrawn with permission from ref. 31. b, Top, distribution of total dynamic
acceleration in the open field (mean in blue; 3 ± 2.2 one-hour open-field
sessions per mouse, n = 6 mice). Bottom, examples of movement
initiation events (red). Dotted line represents acceleration threshold.
c, PETH of a positively modulated DAN aligned to movement initiation.
d, Left, mean trace for all neurons (n = 25, black), positively modulated
neurons (n = 13, green) and negatively modulated neurons (n = 7, red),
aligned to movement initiation (93.45 ± 35.99 (mean ± s.e.m.) spontaneous
initiations per neuron). Grey shadows denote s.e.m. Right, proportion
of positively (+) and negatively (−) modulated and unmodulated (No)
neurons. e, Latency of each neuron to be significantly modulated in relation
to movement initiation (red, negatively modulated; green, positively
modulated). f, Time series of video frames for three movement initiations
(1.5 s duration) and their representation in the motion sensor space.
Cumulative angular velocity is shown in degrees per second (dg s⁻¹).
g, Representation of the initiations of one session using t-distributed
stochastic neighbour embedding (t-SNE) dimensionality reduction.
Colours represent different clusters determined using affinity propagation
clustering. h, Number of initiation clusters in which each positively
modulated neuron was significantly activated (n = 13, 30.4 ± 0.19% of
initiations were preceded by a significant increase in neuron activity).
i, Spread of initiations (mean distance to every other initiation) according
to whether positively modulated neurons were active or not (not active,
n = 8; active, n = 5). j, Normalized firing rate (firing rate / firing rate of
low acceleration trials) for positively modulated neurons of low, medium
and high acceleration trials (n = 13). k, Examples of a vigour-related
neuron (left) and a neuron unrelated to vigour (right). Red, blue and black
represent firing rates during high, medium and low acceleration initiations,
respectively. l, Proportion of vigour-related (red) and non-related neurons
(grey). P < 0.05, paired t-test comparing firing rate of low and high
acceleration trials. Data are mean ± s.e.m. (i, j). *P < 0.05.

A more detailed analysis of the effects of inhibition when mice were
immobile revealed that there was a significant decrease in the
probability of initiating movement during the 15 s of inhibition (Fig. 2k).
Furthermore, even when mice were able to initiate movements, the
latency to initiate was significantly higher than in laser-off trials, and
the initiated movements were less vigorous (Fig. 2l, m). Taken together,
these data indicate that DAN activity before movement initiation
modulates the probability and vigour of future movements, but activity of
these neurons is less critical for the maintenance and vigour of ongoing
movements.

Next, we investigated whether brief activation of DANs, when
animals were immobile, would be sufficient to promote movement
initiation. We expressed ChR2 in SNc DANs using a similar
Cre-dependent strategy (DIO-ChR2-YFP in seven mice, and control
DIO-eYFP in five mice). Stimulation at 20 Hz for 500 ms (Fig. 3a) delivered
when mice were immobile was sufficient to produce overt movement
that lasted several seconds (Fig. 3b), in accordance with previous
findings⁷,¹⁷,¹⁸. The same activation when mice were overtly moving
did not significantly affect ongoing acceleration (Fig. 3b, c).

To further corroborate this finding, we performed an online closed-loop
experiment in which mice received stimulation if they were
immobile for at least 900 ms (in 50% of the trials, Fig. 3d). Trials in which
light was not delivered (50%) were used as within-animal control (laser-off
trials). Average acceleration during the first second after the closed-loop
trigger was higher during laser-on than during laser-off trials in
the ChR2 group (Fig. 3e–g). Moreover, the latency to initiate movement
when SNc DANs were briefly activated was almost three times shorter
than in laser-off trials (Fig. 3h). The percentage of trials in which
movement was initiated during the first second was also higher in laser-on
trials (Fig. 3i). We found no evidence that this closed-loop SNc DAN
activation had a reinforcement effect, because immobility states did
not become more frequent in ChR2 mice (interval between immobility
periods: ChR2, 254.5 ± 116.5; YFP, 150.5 ± 65.8 s; t = 1.74, P = 0.12). We
found that movements initiated during laser-on trials were more vigorous
than movements spontaneously initiated during laser-off trials (Fig. 3j, k).
None of the described effects were found in YFP controls.
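
The closed-loop trigger rule can be sketched as below. This is a simplified offline illustration, not the authors' online implementation: triggering once per immobility bout, drawing laser-on trials independently with probability 0.5, and the function and parameter names are all assumptions.

```python
import random


def closed_loop_triggers(accel, threshold, dt, hold_s=0.9, p_laser=0.5, seed=0):
    """Return (sample index, laser_on) for each immobility bout reaching hold_s.

    A trial is triggered at the sample where acceleration has stayed below
    threshold for hold_s seconds; laser delivery is drawn with probability
    p_laser (an assumed stand-in for the 50% laser-on trials).
    """
    rng = random.Random(seed)
    need = round(hold_s / dt)
    below, trials = 0, []
    for i, a in enumerate(accel):
        below = below + 1 if a < threshold else 0
        if below == need:  # fires once per bout, when the criterion is first met
            trials.append((i, rng.random() < p_laser))
    return trials
```

Because laser-off trials are triggered by the same immobility criterion, they provide a within-animal control with a matched behavioural state.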

The results presented above highlight a specific role for the transient
activity of DANs for the gating and invigoration of self-paced movement
initiation, but not for the modulation of ongoing movements. On
the basis of these findings, one prediction would be that if individual
spontaneous movements are chunked into a sequence of movements,
then DANs would become preferentially active before
sequence initiation, but not during the execution of individual elements
within the sequence. To test this prediction, we trained mice on a
self-paced operant task in which eight lever presses led to a 20% sucrose
solution reward (fixed-ratio eight task (FR8)), without any explicit
stimuli signalling the availability of reward⁴ (Fig. 4a). We implanted a
gradient index lens just above the SNc of four TH-Cre mice (Extended
Data Fig. 7d), and injected a virus that expressed GCaMP6f¹⁹ in a
Cre-dependent manner²⁰ (AAV2/5.CAG.Flex.GCaMP6f). We then used
a miniaturized epifluorescence microscope²¹ to image calcium transients
in genetically identified SNc DANs while mice were performing
the FR8 task (Fig. 4b, c). Similarly to previous findings⁴, by creating
lever press time histograms using normalized fluorescence traces
(z score of ΔF), we found that the proportion of modulated neurons
was different between press events with the highest proportion of
neurons being modulated by the first press (Fig. 4d–f). As predicted,
this higher proportion of neurons that were related to the first press was
not apparent early in training, and developed with sequence learning⁴,²²
(Extended Data Fig. 8).
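
The normalization of fluorescence traces mentioned above can be illustrated with a short sketch. Computing the mean and standard deviation over the whole trace is an assumption of this example; the reference period actually used is not stated in the main text, and the function name is hypothetical.

```python
import statistics


def zscore(trace):
    """Return the trace expressed in standard deviations from its own mean."""
    mu = statistics.fmean(trace)
    sd = statistics.pstdev(trace)
    if sd == 0:
        raise ValueError("trace has zero variance")
    return [(x - mu) / sd for x in trace]
```

Expressing each neuron's trace in units of its own variability lets responses be compared across cells with different baseline brightness.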

Recent studies claimed that SNc neurons are more modulated by
movement, whereas ventral tegmental area neurons are more modulated
by reward⁷. However, we found that around 35% of neurons in the
SNc responded to reward (Fig. 4e). This is similar to the percentage
of neurons modulated during sequence initiation (approximately
40% first press neurons; Fig. 4d, e). Importantly, there was little overlap
between reward- and first-press-modulated neurons, and the overlap
was not significantly different from what would be expected by chance
(Extended Data Fig. 9). These data suggest that different populations
of SNc DANs are related to movement versus reward, but it remains to
be determined whether these correspond to populations that project to
the dorsal versus ventral striatum, as has previously been suggested²³.

Next, we tested whether SNc DAN activity was necessary for
sequence initiation. TH-Cre mice expressing ArchT (n = 11) or YFP
(n = 9) were trained on the FR8 task for 12–14 days. Mice developed
a structured behaviour with predictable sequence initiations and
trajectory after reward consumption²⁴ (Fig. 4a). To inhibit SNc neurons
before the first press of a sequence, the laser was triggered when
mice broke an infrared beam placed between the reward magazine
and the lever (Fig. 4g, h), corresponding to the moment of minimal
DAN activity (before the increase in activity of first press neurons;
Fig. 4g). We compared a block of inhibition (laser-on block) with a
previous block without inhibition (laser-off block) during the same
session. Inhibition during 5 s before the first lever press resulted in
a significant increase in the latency to initiate the action sequence
when compared to laser-off trials (Fig. 4h). Moreover, the probability
of initiating a sequence decreased during the 5 s of DAN inhibition
(Fig. 4i). Consistent with the experiments presented above, when the
inhibition happened after sequence initiation (triggered by the first
press), the inter-press interval and number of presses during the 5-s
inhibition were not altered (Fig. 4j). No effects were observed in YFP
controls (Fig. 4h–j). These results were replicated using DAT-IRES-Cre
mice using a different inhibitory opsin (Jaws²⁵) when pseudorandomly
inhibiting 30% of the trials (Extended Data Fig. 10). Taken together,
these results indicate that SNc dopamine activity before the initiation of
the action sequence modulates the probability and latency of sequence
initiation, but is not critical for the execution of ongoing sequences.

Figure 2 | Inhibition of SNc dopamine neurons impairs movement
initiation, but not ongoing movement. a, Schematics showing fibre
positioning and trial structure. b, Distribution of acceleration in the
open field during laser on and laser off for ArchT (left) and YFP (right)
groups. The vertical dashed line denotes the acceleration threshold.
c, Left, time spent immobile during laser-off and laser-on periods. Clear
bars indicate laser off and filled bars indicate laser on. Right, time spent
immobile during laser-on normalized to the baseline. n = 11 ArchT mice,
9 YFP mice. Single asterisk above ArchT indicates significant difference
from baseline (1). d, Heat maps of acceleration data of all trials where
the mouse was immobile (top) or mobile (bottom) before the start of
the trial, in laser-off (left) and laser-on (right) conditions. n = 11 ArchT
mice. e, f, Acceleration during laser-off and laser-on trials for immobile
trials and mobile trials. n = 7 ArchT mice. The horizontal dotted line
denotes the acceleration threshold. There was an interaction between
inhibition and mobility state when inhibition started (Supplementary
Table 1). g, Left, mean values of the data plotted in e and f. Grey, laser off;
green, laser on. Right, same data normalized (laser on / laser off). Single
asterisk in Immobile indicates significant difference from 1. h, Mean
acceleration per mouse for mobile trials without stops. n = 10 ArchT mice.
i, Left, mobile trials aligned to the first stop. Right, normalized mean
acceleration (laser on / laser off) before (−4 to −3 s) and after (3–4 s)
the first stop for ArchT mice. Single asterisk above ‘After stop’ indicates
significant difference from 1. j, Same as in i, but for the YFP group. k, Left,
mean cumulative probability of movement initiation for immobile trials.
Right, mean probability to initiate movement. n = 7 ArchT mice. l, Mean
acceleration for initiations that occurred during immobile trials of ArchT
mice. Laser on, n = 16; laser off, n = 30 trials. m, Normalized mean latency
to initiate movement. n = 9 YFP mice, 11 ArchT mice. Single asterisk
above Move indicates significant difference from 1. Data are mean ± s.e.m.
Error bars and shaded areas denote s.e.m. *P < 0.05.

Figure 3 | Transient SNc dopamine neuron activation promotes
movement initiation. a, Example of three SNc DANs (single units)
expressing ChR2, following stimulation at 20 Hz. b, Mean acceleration
depending on movement state before the trial. n = 7 ChR2 mice; n = 5 YFP
mice. c, Mean acceleration from 0 to 1 s depending on movement state
before the trial. Left, immobile; right, mobile. n = 7 ChR2 mice; n = 5 YFP
mice. Blue, laser on; grey, laser off. d, Closed loop set-up. e, Heat maps of
acceleration data for all trials. Laser-trigger criteria were reached at time 0.
White crosses indicate onset of movement. f, Mean acceleration. n = 5 mice
per group. Dark blue, ChR2 on; light blue, ChR2 off; black, YFP on; grey,
YFP off. g, Mean acceleration from 0 to 1 s. n = 5 mice per group. h, Latency
to initiate movement. n = 5 mice per group. i, Percentage of trials with
movement initiation between 0 and 1 s. n = 5 mice per group. In h, i, there
is a significant effect for group between ChR2 and YFP. j, Mean distribution
of acceleration during the first second after movement initiation for laser on
(blue) and laser off (black). n = 5 ChR2 mice. k, Mean acceleration during
the first second of movement initiation during laser-off and laser-on trials.
n = 5 ChR2 mice. Grey bars indicate laser off; blue bars indicate laser on.
Data are mean ± s.e.m. Error bars and grey-shaded areas represent s.e.m.
*P < 0.05.

Figure 4 | SNc dopamine neurons are transiently active at sequence
initiation, and when inhibited, impair sequence initiation, but not
sequence performance. a, Example of the behaviour microstructure
during the FR8 task late in training. Red, black and blue circles indicate
first, middle and last press, respectively. Red and blue bars indicate licks
and head entries, respectively. Dashed line denotes reward delivery.
b, Field of view (projection of pixel standard deviation) of a TH-Cre
mouse expressing GCaMP6f in the SNc. Regions of interest correspond to
traces in c. Scale bar, 20 μm. c, Example traces obtained using the CNMF-E
algorithm during FR8 task. d, Percentage of neurons modulated by press
events. n = 4 mice. e, Venn diagram representing reward- and
first-press-related neurons. f, PETH of positively modulated neurons for each press
event (bottom) and the corresponding heat maps (top). Grey shadow
denotes s.e.m. g, Activity of first lever press-responsive neurons aligned
to the cross from the magazine to the lever, before the first lever press.
n = 22. h, Left, latency to initiate lever press sequence. Right, normalized
latency to sequence initiation. ArchT, n = 11 mice; YFP, n = 9 mice.
Single asterisk above ArchT indicates significant difference from 1. i, Top,
distribution of latencies to sequence initiation. Bottom left, percentage of
early initiations. Right, normalized percentage of early initiations (latency
< 5 s). ArchT, n = 11 mice; YFP, n = 9 mice. There is a significant effect
for group between ArchT and YFP. j, Left, press rate. Light was delivered
after the first press. Right, mean press rate normalized. ArchT, n = 11 mice;
YFP, n = 9 mice. *P < 0.05. Clear bars indicate laser-off and filled indicate
laser-on blocks in h–j. Data are mean ± s.e.m.

Here we show that SNc DAN activity modulates self-paced 
movement initiation. Importantly, precisely timed and state- 
dependent optogenetic manipulations did not change ongoing 
movements, indicating a specific role for SNc DAN activity for 
initiation. These results were corroborated using more complex 
movement sequences. 

It has been proposed that dopamine release in the dorsal striatum 
is important for the regulation of movement vigour®!*!°, but it was 
thought that this effect was mostly due to the ongoing tonic levels 
of dopamine release. Our results indicate that the activity of DANs 
before movement onset modulates future movement vigour. This 
could explain why patients with Parkinson's disease select less vigorous 
movements to initiate. It is also in accordance with recent studies 
that have shown that activity of DAN terminals in the dorsal striatum 
preceded spontaneous movement initiation but did not precede and 
even followed acceleration bursts during ongoing movement⁷. Our
results suggest that transient changes in dopamine can function as a 
fast system that acts on top of tonic release to increase the probability 
(and vigour) of initiating movements, presumably by modulating the
excitability of striatal projection neurons²⁴,²⁶, which receive
information about the movements that are ‘planned’ at that exact time via
glutamatergic inputs from cortex and/or thalamus. This suggests a
role for dopamine in gating and invigorating movements that were
planned elsewhere²⁷,²⁸, and is consistent with the observation that DAN
activity is not very action-specific. More sustained changes in DAN 
activity could represent states in which the ‘gate’ is more permissive, 
increasing the probability of action initiation during longer periods of 
time, therefore promoting movement²⁹. This would translate into more
movement variability with exploration of the action space, which could 
be important in situations of uncertainty or learning. 

These results highlight that approaches aimed at providing transient 
modulations of basal ganglia circuitry tied to movement initiation, 
for example, via closed-loop deep-brain stimulation³⁰ triggered by
activity in cortical areas related to motor planning, could be beneficial 
to patients with Parkinson's disease. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 1 February 2016; accepted 11 December 2017. 
Published online 31 January 2018. 


1. Jankovic, J. Parkinson’s disease: clinical features and diagnosis. J. Neurol. 
Neurosurg. Psychiatry 79, 368-376 (2008). 

2. Niv, Y., Daw, N. D. & Dayan, P. How fast to work: response vigor, 
motivation and tonic dopamine. Adv. Neural Inf. Process. Syst. 18, 
1019-1026 (2005). 

3. Schultz, W. Multiple dopamine functions at different time courses. Annu. Rev. 
Neurosci. 30, 259-288 (2007). 

4. Jin, X. & Costa, R. M. Start/stop signals emerge in nigrostriatal circuits during 
sequence learning. Nature 466, 457-462 (2010). 

5. Syed, E. C. J. et al. Action initiation shapes mesolimbic dopamine encoding of 
future rewards. Nat. Neurosci. 19, 34-36 (2016). 

6. Dodson, P. D. et al. Representation of spontaneous movement by dopaminergic 
neurons is cell-type selective and disrupted in Parkinsonism. Proc. Natl Acad. 
Sci. USA 113, E2180-E2188 (2016). 

7. Howe, M. W. & Dombeck, D. A. Rapid signalling in distinct dopaminergic axons 
during locomotion and reward. Nature 535, 505-510 (2016). 

8. Gong, S. et al. Targeting Cre recombinase to specific neuron populations 
with bacterial artificial chromosome constructs. J. Neurosci. 27, 9817-9823 
(2007). 

9. Madisen, L. et al. A toolbox of Cre-dependent optogenetic transgenic 
mice for light-induced activation and silencing. Nat. Neurosci. 15, 793-802 
(2012). 


LETTER 


10. Lima, S. Q., Hromadka, T., Znamenskiy, P. & Zador, A. M. PINP: a new method of 
tagging neuronal populations for identification during in vivo 
electrophysiological recording. PLoS ONE 4, e6099 (2009). 

11. Klaus, A. et al. The spatiotemporal organization of the striatum encodes action 
space. Neuron 95, 1171-1180 (2017). 

12. Frey, B. J. & Dueck, D. Clustering by passing messages between data points. 
Science 315, 972-976 (2007). 

13. Van Der Maaten, L. & Hinton, G. H. Visualizing data using t-SNE. J. Mach. Learn. 
Res. 9, 2579-2605 (2008). 

14. Mazzoni, P., Hristova, A. & Krakauer, J. W. Why don’t we move faster? 
Parkinson’s disease, movement vigor, and implicit motivation. J. Neurosci. 27, 
7105-7116 (2007). 

15. Panigrahi, B. et al. Dopamine is required for the neural representation and 
control of movement vigor. Cell 162, 1418-1430 (2015). 

16. Han, X. et al. A high-light sensitivity optical neural silencer: development and 
application to optogenetic control of non-human primate cortex. Front. Syst. 
Neurosci. 5, 18 (2011). 

17. Barter, J. W. et al. Beyond reward prediction errors: the role of dopamine in 
movement kinematics. Front. Integr. Neurosci. 9, 39 (2015). 

18. Hamid, A. A. et al. Mesolimbic dopamine signals the value of work.
Nat. Neurosci. 19, 117-126 (2016).

19. Chen, T.-W. et al. Ultrasensitive fluorescent proteins for imaging neuronal
activity. Nature 499, 295-300 (2013). 

20. Atasoy, D., Aponte, Y., Su, H. H. & Sternson, S. M. A FLEX switch targets 
channelrhodopsin-2 to multiple cell types for imaging and long-range circuit 
mapping. J. Neurosci. 28, 7025-7030 (2008). 

21. Ghosh, K. K. et al. Miniaturized integration of a fluorescence microscope.
Nat. Methods 8, 871-878 (2011).

22. Wassum, K. M., Ostlund, S. B. & Maidment, N. T. Phasic mesolimbic dopamine 
signaling precedes and predicts performance of a self-initiated action 
sequence task. Biol. Psychiatry 71, 846-854 (2012). 

23. Parker, N. F. et al. Reward and choice encoding in terminals of midbrain 
dopamine neurons depends on striatal target. Nat. Neurosci. 19, 845-854 
(2016). 

24. Tecuapetla, F., Jin, X., Lima, S. Q. & Costa, R. M. Complementary contributions
of striatal projection pathways to action initiation and execution. Cell 166,
703-715 (2016).

25. Chuong, A. S. et al. Noninvasive optical inhibition with a red-shifted microbial 
rhodopsin. Nat. Neurosci. 17, 1123-1129 (2014). 

26. Kravitz, A. V. et al. Regulation of Parkinsonian motor behaviours by optogenetic 
control of basal ganglia circuitry. Nature 466, 622-626 (2010). 

27. Wong, A. L., Lindquist, M. A., Haith, A. M. & Krakauer, J. W. Explicit knowledge 
enhances motor vigor and performance: motivation versus practice in 
sequence tasks. J. Neurophysiol. 114, 219-232 (2015). 

28. Thura, D. & Cisek, P. The basal ganglia do not select reach targets but control 
the urgency of commitment. Neuron 95, 1160-1170 (2017). 

29. Spielewoy, C. et al. Behavioural disturbances associated with
hyperdopaminergia in dopamine-transporter knockout mice. Behav. 
Pharmacol. 11, 279-290 (2000). 

30. Rosin, B. et al. Closed-loop deep brain stimulation is superior in ameliorating 
Parkinsonism. Neuron 72, 370-384 (2011). 

31. Franklin, K. B. J. & Paxinos, G. The Mouse Brain in Stereotaxic Coordinates 
(Academic, 2008). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank A. Vaz for mouse colony management,
I. Vaz for the help during photoidentification experiments, L. Perry for help
with stereological cell counts, A. Klaus, P. Zhou, L. Paninski for help with the
application of the CNMF-E analysis, and the Champalimaud Hardware Platform
(F. Carvalho, A. Silva, D. Bento) for help with the development of the motion
sensors. This work was supported by fellowships from Gulbenkian Foundation
to J.A.d.S. and Grants from Fundação para a Ciência e Tecnologia, Fronteras
de la Ciencia-CONACyT-2022 and the IN226517 DGAPA-PAPIIT-UNAM to F.T.
and from ERA-NET, European Research Council (COG 617142), and HHMI
(IEC 55007415) to R.M.C.


Author Contributions J.A.d.S. and R.M.C. designed the experiments and 
analyses and wrote the paper, J.A.d.S. performed all experiments and analyses, 
F.T. helped with optogenetic and recording experiments, V.P. helped with
accelerometer experiments and accelerometer data analyses. 


Author Information Reprints and permissions information is available
at www.nature.com/reprints. The authors declare no competing financial
interests. Readers are welcome to comment on the online version of the
paper. Publisher’s note: Springer Nature remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.
Correspondence and requests for materials should be addressed to
R.M.C. (rc3031@columbia.edu).


Reviewer Information Nature thanks D. J. Surmeier and the other anonymous 
reviewer(s) for their contribution to the peer review of this work. 




© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 




METHODS 


Animals. All experiments were approved by the Portuguese DGAV and 
Champalimaud Centre for the Unknown Ethical Committee and performed in 
accordance with European guidelines. TH-Cre male mice from the FI12 mouse
line⁸ between 3 and 5 months, DAT IRES-Cre³² and TH-Cre;Ai32⁹ between 2.5
and 6 months were used. 

Sample sizes, randomization and blinding. The number of animals in each 
experiment was based on previous studies using a power of 0.7 and α = 0.05.
n was larger than 5 in all experiments (except imaging experiments) as required
for the use of parametric statistics. No formal method of randomization was used; 
littermates were equally divided among the groups that were compared. There was 
no blinding of experimental groups. Every experiment contained all experimental 
groups that were tested concomitantly. The timing of optogenetic manipulations 
was controlled automatically, not by the experimenter. 

Recombinant adeno-associated viral vectors. The following Cre-dependent
adeno-associated viral vectors were used in the experiments: AAV2/5.CAG.Flex.
GCaMP6f.WPRE.SV40 (titre 1.19 × 10¹³, University of Pennsylvania); AAV2/1.
CAG.Flex.ArchT-GFP (titre 1.4 x 10)”, University of Pennsylvania); AAV2/1.
ChR2-eYFP (titre 1.4 x 101°; University of North Carolina), AAV2/1.EF1a.DIO.
eYFP (titre 1.4 x 101°, University of North Carolina); AAV8/hSyn.Flex.Jaws-GFP
(titre 4.2 x 10'?, University of North Carolina); rAAV8/hSyn.DIO.eGFP (titre
4.9 x 10”, University of North Carolina).

Virus injections, electrode, lens and fibre placement. Surgeries were performed 
using a stereotaxic system (Kopf). Mice were kept in deep anaesthesia using a 
mixture of isoflurane and oxygen (1–3% isoflurane at 1 l min⁻¹). 

For imaging experiments, 1 µl of virus solution was injected in the right 
substantia nigra compacta at the following coordinates: −3.16 mm anteroposterior, 
1.40 mm lateral from Bregma and 4.20 mm deep from the brain surface. The injection 
was done through a glass pipette using a Nanojet II (Drummond Scientific) 
at a rate of 4.6 nl every 5 s. After the injection was finished, the 
pipette was left in place for 10–15 min. The virus solution was kept at −80 °C and 
thawed at room temperature just before the injection. A 500-µm diameter, 8.2-mm 
long gradient index (GRIN) lens (GLP-0584, Inscopix) was implanted at the same 
coordinates as the injection. Before the lens was lowered, a blunt 28G needle was 
lowered to 3 mm deep from the brain surface to facilitate the lowering of the 
GRIN lens. The GRIN lens was then lowered (4.2 mm deep). The lens was fixed 
in place using cyanoacrylate and black dental cement (Ortho-Jet). One 1/16-inch 
stainless-steel screw (Antrin miniatures) was attached to the skull to provide a 
scaffold to build a dental-cement-based cap that protected and fixed the lens to 
the skull. 

Three weeks after surgery, the mouse was anaesthetized and fixed with head 
bars. A baseplate (BPC-2, Inscopix) attached to a mini epifluorescence microscope 
(nVista HD, Inscopix) was positioned above the GRIN lens. To correctly position 
the baseplate, brain tissue was imaged through the lens to find the appropriate focal 
plane using 20% LED power, a frame rate of 5 Hz and a digital gain of 4. Once the 
focal plane was set, the baseplate was cemented to the rest of the cap using the same 
dental cement. Imaging started 2-3 days after this final step. 

The same stereotaxic system and anaesthesia protocol were used for electrode 
and fibre placement. In the case of TH-Cre;Ai32 mice, no virus injection was used. 

The same coordinates were used for optrode placement, except for a depth of 
3.8-3.9 mm from the brain surface. The ground wire was attached to a 1/16-inch 
stainless-steel screw (Antrin miniatures), touching the surface of the brain. 

For optogenetic experiments (ChR2 and ArchT groups), the same procedure 
and coordinates were used, except that a 1.5-µl virus solution was injected 
bilaterally at 2.3 nl every 5 s and optical fibres with a diameter of 230 µm and an NA 
of 0.39 (Thorlabs FMT 200 EMT) were placed bilaterally at a depth of 3.9 mm from 
the brain surface. Optical fibres were built based on a published protocol*?. Two 
TH-Cre mice underwent the same virus injection protocol as the ArchT group and 
were used to obtain the data presented in Extended Data Fig. 5. 

For the Jaws experiment, the same procedure was followed except that 1 µl 
of virus was used and optical fibres with a diameter of 400 µm and an NA of 0.5 
(Thorlabs FP400URT) were implanted at the same depth. 

Optogenetic set-ups. For ChR2-expressing mice and corresponding controls, 
light from a free-launched 200-mW, 473-nm, diode-pumped, solid-state laser 
(Laserglow Technologies), controlled using an AOM (AA Optolectronic), was 
delivered after being captured by a collimator and split using a one-input to 
two-outputs rotary joint (Doric Lenses). In addition, 200-µm, 0.22 NA optical 
fibre patch cords were used to guide the light to the fibres implanted in the mice. 

For the ArchT and the corresponding controls, the same set-up was used, but 
with a different light source (free-launched 500-mW, 556-nm, diode-pumped, 
solid-state laser from CNI Lasers). 

For the Jaws group and corresponding controls, we used a red LED (around 
100 mW maximum output, approximately 625 nm, Prizmatix). The light was 
captured by a large-diameter optical fibre (1 mm), which connected to a one-input 
to one-output rotary joint. A branched 500-µm optical fibre was then used to 
connect to the fibres that had been implanted in the mice. 

Light intensity was measured before and during experiments using a fibre 
similar to the ones implanted and a power meter (PD1000-S130C, Thorlabs). The 
power was adjusted at the tip of the fibre to be around 15 mW for photoidentification 
experiments (Fig. 1 and Extended Data Fig. 3), approximately 3 mW for 
ChR2 (Fig. 3), around 35 mW for ArchT experiments (Figs 2, 4 and Extended Data 
Fig. 5), and approximately 9 mW in Jaws experiments (Extended Data Fig. 10). 
Open field. We used a 39 × 39 cm open field with black walls (17.5 cm height) 
and white acrylic floor to assess the spontaneous movement of mice. The open 
field was inside a sound-attenuating chamber. Illumination was provided by white 
(2700 K) LEDs (Dioder, Ikea) that were placed on the floor and symmetrically 
around the open field in a way that illumination of the open field was uniform 
and indirect (135 lx). 
FR8 operant task. Behaviour training and testing took place in operant chambers 
as described previously’. In brief, each chamber (23 cm L × 20 cm W × 19.5 cm H) 
was housed within a sound-attenuating box (Med-Associates) and equipped with 
one retractable lever on the left side of the food magazine and a house light (3 W, 
24 V) mounted on the left lateral wall. Sucrose solution (10%) was delivered into a 
metal cup in the magazine through a syringe pump (20 µl per reward). Magazine 
entries were recorded using an infrared beam and licks using a contact lickometer. 
Mice were placed on food restriction throughout training, and fed daily after the 
training sessions with approximately 2 g of regular food to allow them to maintain 
a body weight of around 85% of their baseline weight. 

Training started with a 30-min magazine training session, in which the 
reinforcer was delivered on a random time schedule, on average every 60 s (30 
reinforcers). The following day, lever-pressing training started with continuous 
reinforcement (CRF), in which animals obtained a reinforcer after each lever press. 
The session began with the illumination of the house light and insertion of the 
lever, and ended with the retraction of the lever and the turning off of the house light. 
On the first day of CRF, the sessions lasted 45 min or until mice received five 
reinforcers, the second day of CRF lasted 45 min or until mice received 15 reinforcers, 
and the last day of CRF lasted 45 min or until mice received 30 reinforcers. This 
last CRF session was repeated if mice failed to obtain 30 rewards within the time 
limit. After CRF, animals started to be trained (day 1) on a fixed ratio schedule in 
which eight presses earn a reinforcer (FR8), without any stimulus signalling when 
eight presses were completed or when the reinforcer was delivered; this training 
continued for 12–14 days. 

All timestamps of lever presses, magazine entries and licks for each animal were 
recorded with a 10-ms resolution. The same training chamber was used during 
imaging and optogenetic experiments. 
Acceleration and video recordings. In the experiments where photoidentified 
TH⁺ neurons were recorded, acceleration was recorded using a digital 9-axis 
inertial sensor with a sampling rate of 200 Hz (MPU-9150, Invensense) assembled 
on a custom-made PCB and connected to a computer via a custom-made USB 
interface PCB (Champalimaud Foundation Hardware Platform). 

For the other experiments, an analogue 3-axis inertial sensor was used 
with a sampling rate of 1,000 Hz (LIS331AL, ST) assembled on a custom PCB 
(Champalimaud Foundation Hardware Platform), and the signals were fed to the 
analogue inputs of a Cerebus recording system (Blackrock Microsystems). 

Acceleration data obtained from these sensors aggregates acceleration from 
two sources: acceleration generated from gravity and from the body. To separate 
these two components we processed data obtained from both types of inertial 
sensor using a custom MATLAB code. We used a standard approach* that relies 
on filtering the 3-axis data using a Butterworth filter. In our analysis we used a 
median filter (7 bins wide) to remove noise peaks. Then we used a 1-Hz high-pass 
fifth-order Butterworth filter to separate the static (gravitational) component of the 
signal. Unless stated otherwise, we used the sum of the three vectors of acceleration 
as a global measurement of body acceleration for our analysis. The dynamic (body 
acceleration) component accurately tracked animal movement, and correlated well 
with pixel change in video measurements (Extended Data Fig. 1b-d). 

Video recordings were obtained using a charge-coupled device camera (DFK 
31BF03, Imaging Source) and a custom-developed software in Labview (National 
Instruments) at a rate of 15 frames per second. This software allowed us to intro- 
duce signals to the video frames to sync acceleration, neural recordings and light 
delivery periods. 

Classification of movement state. We used the average distribution of acceleration 
in the open field for each experimental group to define the movement state of the 
mice. This was possible, because the average distribution of the logarithm of total 
body acceleration was clearly bimodal, with a very low acceleration distribution 
corresponding to immobility periods (with the possible exception of small and 
slow postural adjustments) and a high acceleration distribution corresponding 
to periods of mobility (see Fig. 1b and Extended Data Fig. 1 for a comparison 
between video and motion-sensor acceleration data). The acceleration threshold 
was defined as the lowest acceleration value between the two distributions. Unless 
stated otherwise in the methods, initiation events were defined as transitions 
between periods of at least 300 ms below the threshold followed by at least 300 ms 
above the threshold. 
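As an illustrative sketch only (Python, not the MATLAB code used in the study), the initiation rule described above could be written as follows. The function name `find_initiations`, its arguments and the toy trace are assumptions introduced for this example; the threshold is taken as given rather than derived from the bimodal distribution.

```python
def find_initiations(accel, threshold, fs_hz, min_ms=300):
    """Return sample indices at which a movement initiation starts.

    An initiation is the first above-threshold sample of a run lasting at
    least `min_ms`, immediately preceded by a below-threshold run of at
    least `min_ms`.
    """
    min_samples = int(round(min_ms * fs_hz / 1000.0))
    initiations = []
    below_run = 0  # length of the below-threshold run seen so far
    i, n = 0, len(accel)
    while i < n:
        if accel[i] <= threshold:
            below_run += 1
            i += 1
            continue
        # Start of an above-threshold run: find where it ends.
        j = i
        while j < n and accel[j] > threshold:
            j += 1
        if below_run >= min_samples and (j - i) >= min_samples:
            initiations.append(i)
        below_run = 0
        i = j
    return initiations


# Toy trace sampled at 10 Hz, so 300 ms corresponds to 3 samples.
trace = [0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1]
starts = find_initiations(trace, threshold=0.5, fs_hz=10)
```

In the toy trace only the first above-threshold run qualifies: the second is preceded by too short an immobility period and the third is itself too short.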

Extracellular recordings of photoidentified SNc DANs. We used a single-drive 
movable microbundle (sixteen 23-µm tungsten electrodes) with an optic-fibre 
guide cannula (Innovative Neurophysiology). An optical fibre with 230 µm 
diameter and an NA of 0.39 (Thorlabs FMT 200 EMT) was inserted in the cannula 
just to the top of the electrode microbundle cannula (see Fig. 1a for schematics). 
The neural activity and the timestamps from the light stimulation were recorded 
using a Cerebus recording system (Blackrock Microsystems). 

Experiments were started one week after electrode placement. Every day, 
we sorted putative units using an online sorting algorithm (Central Software, 
Blackrock Microsystems) while the mouse was in its home cage. If putative single 
units were isolated, we delivered a screening protocol consisting of a train of 100 
blue light pulses with a 10-ms width delivered at 1 Hz. Using neurophysiology 
data analysis software (NeuroExplorer V4), we built peri-event time histograms (PETHs) that were aligned to 
the train pulses. If any of the isolated units appeared to be modulated by the light 
train, the mouse was introduced to the open field and neurons were recorded for 
1 h. The stimulation protocol was run again at the end of the open field session for 
confirmation. At the end of the experiment, the microbundle was advanced 50 µm 
to record the next day. We used six mice in these experiments. During this experiment, 
microbundles were moved on average 433 ± 93.1 µm. 

Units were re-sorted using an offline sorting algorithm (Offline Sorter V3, 
Plexon Inc.) to isolate single units on the basis of waveform characteristics, 
inter-spike intervals and clustering. Single units together with the timestamps of the 
light stimulation provided by a pulse generator (Master 8, AMPI) were exported 
to MATLAB for analysis. 

Criteria used to photoidentify DANs. Neural activity referenced to light pulse 
onset was averaged in 1-ms bins, and averaged across trials to construct a PETH, 
which was the basis for analysing amplitude and latency of light-related firing 
activity. Distributions of the PETH from —900 to —10 ms before light onset were 
considered baseline activity. We then determined which bins, slid in 1-ms steps 
during an epoch spanning from light onset to 50 ms after, met the criteria for 
significant firing rate increases. A significant increase in firing rate was defined 
as at least four consecutive bins with a firing rate larger than a threshold of five 
standard deviations above baseline activity. The latency of modulation for 
photoidentification was defined as the time between light onset and the first significant 
bin. On the basis of the distribution of latencies of significantly modulated neurons 
and in accordance with previous studies**°, we used a very short latency (<7 ms) 
for neural response to light, combined with at least a 30% increase in firing rate 
during the light pulse, and a high correlation coefficient between spike waveforms 
during light on and off (>0.9) as criteria for positive photoidentification (Extended 
Data Fig. 3). 
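The photoidentification criteria above can be sketched in Python as follows. This is a minimal illustration, not the analysis code of the study: the function names, the treatment of the response window as 0 ≤ t < 50 ms, and the comparison of the light-on rate against a light-off rate for the 30% increase are assumptions made for the example.

```python
from statistics import mean, stdev


def light_response_latency(peth, times_ms, n_consecutive=4, n_sd=5.0):
    """Latency (ms) of the first significant light-evoked increase, or None.

    `peth` holds the trial-averaged firing rate per 1-ms bin and `times_ms`
    the bin times relative to light onset.
    """
    # Baseline: bins from -900 to -10 ms relative to light onset.
    baseline = [r for r, t in zip(peth, times_ms) if -900 <= t <= -10]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    run_start, run_len = None, 0
    for rate, t in zip(peth, times_ms):
        if t < 0 or t >= 50:  # only search from light onset to 50 ms after
            continue
        if rate > threshold:
            if run_len == 0:
                run_start = t
            run_len += 1
            if run_len >= n_consecutive:
                return run_start  # first bin of the significant run
        else:
            run_len = 0
    return None


def is_photoidentified(latency_ms, rate_on, rate_off, waveform_corr):
    """Apply the three criteria: latency < 7 ms, >= 30% rate increase and
    light-on versus light-off waveform correlation > 0.9."""
    return (
        latency_ms is not None
        and latency_ms < 7
        and rate_on >= 1.3 * rate_off
        and waveform_corr > 0.9
    )
```

For instance, a unit whose rate jumps well above baseline from 3 ms after light onset for at least four bins would be assigned a latency of 3 ms and, given a sufficient rate increase and waveform correlation, pass the test.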

Criteria to identify neurons modulated by movement initiation. We built a 
PETH for each photoidentified single unit spanning from 1,500 ms before to 
1,500 ms after movement-initiation events. Neural activity was averaged in 100-ms bins, 
shifted by 1 ms (100 bins, centred on current bin). Distributions of the PETH 
from −1,000 to −500 ms before light onset were considered baseline activity. We 
then determined which bins, slid in 1-ms steps during an epoch spanning from 
−500 ms to 500 ms relative to movement initiation, met the criteria for significant firing 
rate changes. A significant change in firing rate was defined when at least 50 consecutive 
bins had a firing rate higher or lower than a threshold of 2.56 standard 
deviations above or below baseline activity (99% confidence interval). The latency 
to modulation was defined as the time between movement onset and the first of 
the 50 consecutive significant bins. 

Area under the receiver operating characteristic curve (auROC) analysis. For 
this analysis, we used a method similar to the one described previously*>. We 
convolved the spike trains with a function similar to a post-synaptic potential”°. 
To produce the ROC curves, we compared the firing rate of each 50-ms bin to the 
firing rates during baseline (−1,500 to −1,000 ms before movement initiation) 
across trials. The auROC for each bin was calculated using trapezoidal numerical 
integration. 
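A minimal Python sketch of an auROC computed by trapezoidal integration is given below for illustration. It is not the code used in the study; the function name `auroc` and the way the criterion is swept over the observed values are assumptions, and the convolution of the spike trains is not included.

```python
def auroc(baseline, test):
    """Area under the ROC curve for discriminating `test` from `baseline`.

    Both arguments are lists of per-trial firing rates. A value of 0.5 means
    no discrimination, values above 0.5 mean `test` tends to exceed baseline.
    """
    criteria = sorted(set(baseline) | set(test), reverse=True)
    points = [(0.0, 0.0)]  # (false-positive rate, true-positive rate)
    for c in criteria:
        fpr = sum(b >= c for b in baseline) / len(baseline)
        tpr = sum(x >= c for x in test) / len(test)
        points.append((fpr, tpr))
    # Trapezoidal integration of TPR over FPR.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

Applied bin by bin, this yields one auROC value per 50-ms bin, which together form the auROC trace of a neuron.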

Identification of neuron types using affinity propagation clustering. We used 
the affinity propagation algorithm¹² to look for subtypes of significantly modulated 
neurons (Extended Data Fig. 3f). It is an efficient clustering algorithm that takes as 
inputs the similarities between pairs of observations in the dataset (in this case, the 
auROC traces for each modulated neuron), and finds exemplars and the clusters 
around them by exchanging real-valued messages between data points. We used 
the MATLAB (Mathworks) function made available by the authors of ref. 12 at 
http://www.psi.toronto.edu/index.php?q=affinity%20propagation. We used the 
correlation between neuron auROC traces as the measure of similarity used by the 
algorithm. We also used maxits = 1,000, convits = 100, lam = 0.9 and a preference 
equal to the median similarity of the dataset. 

Trial-by-trial analysis of significant increases in neuron activity before 
movement initiation. We defined a baseline distribution of the number of spikes 
per 200-ms bin for a period of 1s (from 2,200 ms before movement initiation to 
1,200 ms before movement initiation) for all trials. On the basis of this distribution, 
we determined a criterion for each neuron such that 90% of the 200-ms bins in 
the baseline had a lower number of spikes than the criterion. Trials were considered 
to present a significant increase in neuron activity when the number of spikes 
during the 200-ms bin immediately preceding movement initiation was above 
this criterion. 
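The per-neuron criterion can be illustrated with the short Python sketch below. It is an interpretation rather than the original implementation: the function names are hypothetical, the criterion is taken as the smallest spike count that at least 90% of baseline bins fall strictly below, and "above" is read as strictly greater.

```python
def spike_count_criterion(baseline_counts, fraction=0.9):
    """Smallest count c such that at least `fraction` of the baseline
    200-ms bins contain fewer than c spikes."""
    n = len(baseline_counts)
    for c in range(max(baseline_counts) + 2):
        lower = sum(b < c for b in baseline_counts)
        if lower / n >= fraction:
            return c


def significant_trials(pre_movement_counts, criterion):
    """Flag trials whose final pre-movement bin exceeds the criterion."""
    return [count > criterion for count in pre_movement_counts]


# Ten baseline bins: five with 0 spikes, four with 1 spike, one with 3.
criterion = spike_count_criterion([0] * 5 + [1] * 4 + [3])
flags = significant_trials([1, 2, 3, 5], criterion)
```

Here nine of the ten baseline bins contain fewer than two spikes, so the criterion is 2 and only trials with three or more spikes in the last pre-movement bin are flagged.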

Definition of movement initiation clusters and spread. We used a methodology 
previously developed by our group to use motion-sensor data to classify 
behaviour". 

We defined 1-s movement initiation trajectories specified by three motion-sensor 
variables: total body acceleration, the angular velocity of the axis most 
parallel to the dorsal–ventral axis of the mice and the gravitational acceleration 
of the same axis. We divided these trajectories into two 500-ms bins and for each 
bin we defined a vector composed of a concatenation of a normalized histogram 
for each motion-sensor variable. In the end, each trajectory was defined 
by the concatenation of two vectors, the first representing the distributions of the 
motion-sensor variables for the first 500 ms of initiation, and the second for the 
last 500 ms. Consequently, the final initiation vector was composed of six different 
motion-sensor variable distributions. We then determined a matrix representing 
the distances from each initiation vector to every other vector. To calculate the 
distances, the motion-sensor variable distribution of one vector was compared to 
the same motion-sensor variable distribution of another vector (range of 0-1; with 
0 indicating that the vectors are exactly the same and 1 indicating that the vectors 
are maximally different). The differences between each motion-sensor variable 
distribution were then squared and summed together to find the distance between 
the two initiation vectors that were evaluated (range 0-6). The final distance matrix 
was built by comparing each initiation vector in this way and squaring the final 
result, obtaining a matrix that varied from 0 (exactly the same) to 36 (maximally 
different). The spread values presented in Fig. 1i were determined by averaging 
the distances between movement initiations. 
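The distance and spread computation described above can be sketched as follows. This Python example is illustrative only. The text does not name the measure used to compare two histograms, so half the summed absolute difference (which lies between 0 and 1 for normalized histograms) is assumed here; all function names are hypothetical.

```python
def histogram_distance(p, q):
    """Assumed per-variable distance between two normalized histograms:
    0 when identical, 1 when they do not overlap at all."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def initiation_distance(u, v):
    """Distance between two initiation vectors, each a list of six
    normalized histograms. Per-histogram distances are squared and summed
    (range 0-6) and the result is squared again (range 0-36)."""
    partial = sum(histogram_distance(p, q) ** 2 for p, q in zip(u, v))
    return partial ** 2


def spread(vectors):
    """Mean pairwise distance between the initiations of one session."""
    n = len(vectors)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    total = sum(initiation_distance(vectors[i], vectors[j]) for i, j in pairs)
    return total / len(pairs)


# Toy initiations built from two-bin histograms.
u = [[1.0, 0.0]] * 6                     # all mass in the first bin
v = [[0.0, 1.0]] * 6                     # all mass in the second bin
w = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3  # half like u, half like v
```

With these toy vectors, `u` and `v` are maximally different (distance 36), whereas `w` differs from each of them in three of the six histograms (distance 9).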

After determining an initiation distance matrix for each session, we used affinity 
propagation¹² to find clusters of initiations (see the description above regarding 
this methodology). We provided the affinity clustering algorithm with the additive 
inverse of the distance matrix as a similarity matrix. We also used maxits = 1,000, 
convits = 100; lam = 0.9. To make sure that we were selecting a consistent structure 
for the data, we used a value for preference between the minimum and the 
maximum similarity of each similarity matrix that provided the highest but at 
the same time most stable number of clusters (that is, a value was chosen in the 
middle of an interval of consecutive values that provided the same number of 
clusters). 
t-Distributed stochastic neighbour embedding for visualization of initiations. 
We used t-distributed stochastic neighbour embedding (t-SNE)¹³ to visualize and 
assess the existence of structure within the movement initiations of each session 
(see the example in Fig. 1h). We implemented this using MATLAB code provided 
by the authors of ref. 13 (https://lvdmaaten.github.io/tsne/). We used a 2D t-SNE 
with a perplexity of 15 to produce the image in Fig. 1h. 

DAN activity and movement vigour analysis. We used the mean acceleration 
during the first 500 ms of each spontaneous initiation as a measurement of 
movement vigour, and separated trials into low acceleration trials (lower tertile), 
medium and high acceleration trials (upper tertile). We then calculated the activity 
of positively modulated neurons during 300 ms before movement initiation for 
each acceleration tertile. A neuron was considered to be vigour related if the activity 
during the lower tertile trials was significantly lower than the activity during the 
upper tertile trials. 
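As a sketch of the tertile split (Python, for illustration only), trials can be ranked by their mean initiation acceleration and the pre-movement activity averaged within each group. The function names and the handling of trial counts not divisible by three (extra trials go to the medium group) are assumptions; the statistical comparison between tertiles is not shown.

```python
from statistics import mean


def split_into_tertiles(accelerations):
    """Return trial indices for the lower, middle and upper acceleration tertile."""
    if len(accelerations) < 3:
        raise ValueError("at least three trials are needed")
    order = sorted(range(len(accelerations)), key=lambda i: accelerations[i])
    k = len(order) // 3
    return {
        "low": sorted(order[:k]),
        "medium": sorted(order[k:len(order) - k]),
        "high": sorted(order[len(order) - k:]),
    }


def mean_activity_by_tertile(accelerations, pre_movement_rates):
    """Mean pre-movement firing rate of one neuron within each tertile."""
    groups = split_into_tertiles(accelerations)
    return {
        name: mean(pre_movement_rates[i] for i in idx)
        for name, idx in groups.items()
    }


# Six toy trials: mean acceleration and firing rate before initiation.
acc = [0.5, 0.1, 0.9, 0.3, 0.7, 0.2]
rates = [5, 1, 9, 3, 7, 2]
by_tertile = mean_activity_by_tertile(acc, rates)
```

A neuron whose mean rate increases from the low to the high group, as in this toy example, is the kind of pattern the vigour analysis looks for.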

GCaMP6f imaging using a mini-epifluorescence microscope. Mice were briefly 
anaesthetized using a mixture of isoflurane and oxygen (1% isoflurane at 1 l min⁻¹) 
and the mini-epifluorescence microscope was attached to the baseplate. This was 
followed by a period of 20-30 min of recovery in the home cage before experiments 
started. Fluorescence images were acquired at 10 Hz and the LED power was set 
to 10–20% (0.1–0.2 mW) with a gain of 4. Image acquisition parameters were set to 
the same values between sessions to be able to compare the activity recorded. Three 
GCaMP6f-expressing TH-Cre mice were imaged while freely exploring an open 
field. The same mice and one more were also imaged during the FR8 task. Data 
shown in Fig. 4 were obtained in two consecutive late training sessions (between 
days 7 and 13). 
Calcium image processing and analysis. GCaMP6f image processing and 
fluorescence trace extraction. All fluorescence movies were initially processed using 
the Mosaic Software (v.1.1.2, Inscopix). First, all frames were spatially binned by 
a factor of 4. To correct the movie for translational movements and rotations, the 
frames were registered to a reference image consisting of an average of the raw 
fluorescence movie. This was achieved by implementing the TurboReg registration 
engine’’ within the mosaic software. The movie was cropped after registration to 
remove the post-registration black borders. 

GCaMP6f fluorescence trace extraction. Although calcium imaging using miniscopes 
enables researchers to image neurons in freely moving mice, it is a challenge 
to adequately extract neuronal signals without background contamination. Because 
of this, we implemented the ‘constrained non-negative matrix factorization for 
endoscopic data (CNMF-E) framework''®, This recently described framework 
is an adaptation of the CNMF algorithm”. It can reliably deal with the large 
fluctuating background from multiple sources in the data, enabling accurate 
source extraction of cellular signals. It includes four steps: (1) initialize spatial and 
temporal components of single neurons without the direct estimation of the background; 
(2) estimate the background given the estimated spatiotemporal activity of 
the neurons; (3) update the spatial and temporal components of all neurons while 
fixing the estimated background fluctuations; (4) iteratively repeat steps 2 and 3. 
Criteria to identify DANs modulated by movement initiation using GCaMP6f 
imaging. We built a PETH for each neuronal trace spanning from 3 s before to 
3 s after movement-initiation events. For this analysis, we considered movement 
initiations as transitions from a period of at least 500 ms below to a period of 
at least 500 ms above the acceleration threshold. Distributions of the PETH from 
−3 to −1 s before movement onset were considered baseline activity. We then 
determined which bins, during an epoch spanning from 0.5 s before to 0.5 s after 
movement initiation, met the criteria for significant ΔF changes. A significant 
change in ΔF was defined if at least 2 consecutive bins had ΔF higher or lower 
than a threshold of 99% above or below baseline ΔF. 

Criteria to identify lever-press-related and reward-related DANs using 
GCaMP6f imaging. We constructed a PETH for each neuron trace spanning 
from −3 to 3 s from lever press onset for the first, second, third, third to final, 
second to final and final press, and also for the first lick of reward. Distributions 
of the PETH from −5 to −3 s before first lever press were considered baseline 
activity for all press-related activity and distributions from −5 to −3 s of the first 
lick of reward PETH were considered as baseline for reward-related activity. We 
then searched each PETH during an epoch spanning from −0.5 s to 0.5 s for bins 
that were significantly different from the baseline. A significant change in fluorescence 
was defined as at least three consecutive bins with fluorescence higher than 
a threshold of 99% above baseline ΔF. 

Extracellular recordings during SNc DAN inhibition. Optic fibres and electrodes 
were positioned 300 µm above the SNc. Light intensity was around 35 mW at the 
tip of the fibre, corresponding to an estimated” irradiance of 70–206 mW mm⁻² 
at SNc depth (200 to 400 µm from the fibre tip). Neural activity was recorded daily 
and the electrodes were moved 50 µm at the end of each recording session. We were 
thus able to record neural activity from different depths (−3.90 mm to −4.60 mm 
from the brain surface), with neurons being recorded above, within and below the 
SNc. We recorded from 140 units and observed that at depths where the SNc is 
located, more than 60% of recorded units were inhibited (Extended Data Fig. 5), 
whereas above and below the SNc, very few neurons were modulated. Furthermore, 
we only observed one single unit that was modulated by light at the depth closest 
to the fibre where light intensity is higher (−3.9 to −3.95 mm, 0.7% of all units 
recorded, 7.7% of all units recorded at this depth), indicating that light delivery per 
se was not sufficient to change neural activity at this power. The same positioning 
of the fibres and light intensity was used during the open field and operant task 
inhibition experiments. 

We used the same set-up and methodology described for the extracellular 
recordings of photoidentified SNc DANs, but instead of using blue light, we used 
green light (see ‘Optogenetic set-ups’). The mean auROC traces presented in 
Extended Data Fig. 5a, b were calculated as described for the photoidentification 
experiments. The anatomical scheme depicted in Extended Data Fig. 5b was based 
on the histologically determined position of the electrode cannula and the amount 
of electrode travel at each recording session. 
Open field. Mice were introduced to the open field and green light was delivered 
continuously for periods of 15 s (mean of 22 ± 3 trials with a mean inter-trial 
interval (ITI) of 64 ± 28 s per mouse in the ArchT group and a mean of 23 ± 4 
trials with a mean ITI of 65 ± 33 s per mouse in the YFP group). An open-field 
session was done on the day before to habituate mice to the open field and to light 
delivery (with similar trial structure except for light duration, which was 5 s). Data 
regarding this first session are shown in Extended Data Fig. 6. 

Operant task. The same groups of ArchT and YFP mice used in the open-field 
inhibition experiment were trained to perform the FR8 task. Optogenetic 
experiments were started at the end of the FR8 training. We used two different 
light-delivery schedules: continuous light delivery for 5 s before the first lever press 
in a sequence; and continuous light delivery for 5 s after the first lever press in a 
sequence. For the first condition, we made the triggering of the light contingent 
on the breaking of an infrared beam (IRB) positioned right next to the magazine, 
on the side of the lever. This way, mice coming from consuming the reward would 
break the IRB before they started the next sequence. Sessions were divided into 
two blocks: a first block of 15 trials without light delivery; and a second block of 
10 trials with light delivery. The last 10 trials with no light delivery were used to 
compare with the 10 trials with light delivery. In a few trials, the mouse failed to 
break the IRB before starting the action sequence. These trials were discarded 
and not included in the analysis. Sessions with no light delivery were interspersed 
between sessions with light delivery and a first session with light delivery was only 
used to habituate the mice to the delivery of light and it was not analysed. The same 
methodology was used in Jaws experiments with the exception that instead of light 
on and light off blocks there was a 30% probability of switching on the laser for each 
trial and data were collected in three consecutive sessions for each experimental 
condition (inhibition before initiation and inhibition after first press). 

SNc DAN activation. Open field. Mice were introduced to the open field and a 
train of blue laser light (10-ms pulses at 20 Hz, for 0.5 s) was delivered with a 
variable interval: after 90 s there was a 33% probability that the light was delivered 
and this was repeated every 10 s until light was delivered. 
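As a rough worked example of this schedule, and assuming that the first draw occurs 90 s after the previous train, that successive draws are independent and that the 0.5-s train duration can be neglected, the number of failed draws before delivery follows a geometric distribution and the expected interval T between trains is:

```latex
\mathbb{E}[T] = 90\,\mathrm{s} + 10\,\mathrm{s}\cdot\frac{1-p}{p},
\qquad p = 0.33
\;\Rightarrow\;
\mathbb{E}[T] \approx 90\,\mathrm{s} + 10\,\mathrm{s}\times 2.03 \approx 110\,\mathrm{s}.
```

The shortest possible interval under these assumptions is 90 s, and longer intervals become progressively less likely in 10-s steps.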

Closed loop experiment in the open field. For closed loop experiments, the same 
light pulse train was used, but it was delivered depending on the acceleration state 
of the mouse in the following way: acceleration of mice was monitored online 
by feeding the analogue accelerometer data through a Cerebus recording system 
(Blackrock Microsystems) into MATLAB (Mathworks) using Blackrock’s MATLAB 
interface (CBMEX). Using a custom MATLAB code, we processed accelerometer 
data as described above. We monitored acceleration of mice using bins of 300 ms 
and when mice reached 900 ms below the threshold used to identify immobility, 
light was delivered with a 50% probability. There was a minimum of 30s between 
trials. In this experiment, we considered the maximum acceleration during the 
first second after each movement initiation as a measurement of initiation vigour. 
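The closed-loop rule can be illustrated with the following Python sketch, which operates offline on a list of per-bin acceleration values rather than on a live data stream. It is not the MATLAB/CBMEX code used in the study. The function name, the treatment of every draw (with or without light) as a trial, and the reset of the immobility counter after each trial are assumptions made for this example.

```python
import random


def closed_loop_trials(bin_means, threshold, bin_ms=300, bins_required=3,
                       p_light=0.5, min_iti_ms=30000, rng=None):
    """Return (time_ms, light_on) for each closed-loop trial.

    `bin_means` holds the acceleration in consecutive 300-ms bins. A trial is
    triggered once the animal has been below `threshold` for 900 ms (three
    bins) and at least `min_iti_ms` have passed since the previous trial;
    light is then delivered with probability `p_light`.
    """
    rng = rng or random.Random()
    trials = []
    below_bins = 0
    last_trial_ms = None
    for i, accel in enumerate(bin_means):
        t_ms = (i + 1) * bin_ms  # time at the end of the current bin
        below_bins = below_bins + 1 if accel < threshold else 0
        immobile = below_bins >= bins_required
        rested = last_trial_ms is None or t_ms - last_trial_ms >= min_iti_ms
        if immobile and rested:
            light_on = rng.random() < p_light
            trials.append((t_ms, light_on))
            last_trial_ms = t_ms
            below_bins = 0  # assumed: immobility must accumulate again
    return trials
```

Passing a seeded `random.Random` instance as `rng` makes a simulated session reproducible, which is convenient when checking the trial timing against recorded data.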
Anatomical verification. Animals were euthanized after completion of the 
behavioural tests. First, animals were anaesthetized with isoflurane, followed 
by intraperitoneal injection of ketamine–xylazine (around 5 mg kg⁻¹ xylazine; 
100 mg kg⁻¹ ketamine). Animals were then perfused with 1× phosphate-buffered 
saline (PBS) and 4% paraformaldehyde, and brains were extracted for 
histological processing. Brains were kept in 4% paraformaldehyde overnight and 
then transferred to 1× PBS solution. Brains were sectioned coronally in 50-µm 
slices using a Leica vibratome (VT1000S) and kept in PBS solution before 
mounting or immunostaining experiments. Images were taken using a wide-field 
fluorescence microscope (Zeiss AxioImager) and the tip of the longest track 
found was used to determine the anatomical location of the implants (lenses, 
fibres and electrodes), which was represented in the corresponding Allen Brain 
Atlas"! slice (Extended Data Fig. 7). 

To estimate the specificity of the TH-Cre line, we crossed TH-Cre mice with 
ROSA26-eGFP mice. TH-Cre;ROSA26-eGFP mice express GFP in neurons 
expressing Cre recombinase. We used slices from the brain of one TH-Cre;ROSA26- 
eGFP mouse and used a tyrosine hydroxylase antibody (ImmunoStar) to label the 
ventral tegmental area (VTA) and SNc DANs (Extended Data Fig. 2). We imaged 
the VTA and SNc in three different slices (approximately −2.9 mm, −3.3 mm, 
−3.8 mm from Bregma) using a confocal microscope equipped with a Diode 
405-nm, Argon multi-line 458–488–514-nm and DPSS 561-nm lasers (LSM710, 
Zeiss). We acquired z stacks (354 µm × 354 µm × 54 µm; 5-µm interslice interval) 
in a tile that covered the VTA, SNc and areas 200 µm above and below the SNc. 
We imported these images into the Stereo Investigator software (MBF Bioscience) 
and used a stereological approach to count labelled cells (TH⁺ and Cre⁺) and 
evaluate co-localization, using a 100 µm × 100 µm counting frame (Extended Data 
Fig. 2a-c). 

We also used a stereological approach to estimate the rate and specificity of 
infection of the AAV2/1.CAG.Flex.ArchT-GFP and the density of TH⁺ neurons 
in the SNc of ArchT and YFP TH-Cre mice (ArchT, n = 2; YFP, n = 2). We imaged 
every six slices in which the VTA and/or SNc were found, using the confocal microscope 
described above (three slices per mouse). Using a 40× magnification, we 
acquired z stacks (354 µm × 354 µm × 5 µm; 5-µm interslice interval) in a tile that 
covered the whole SNc. These tile z stacks were imported into the Stereo Investigator 
software (MBF Bioscience) and quantification of the TH⁺, eYFP⁺ or GFP⁺ 
cells was performed using a 100 µm × 100 µm counting frame (Extended Data 
Fig. 2d-h). 

Statistics. Statistical hypothesis testing was done at a 0.05 significance level 
(except for classification of neurons, which was done at a 0.01 significance level as 
explained above). Parametric testing was used whenever possible to test differences 
between two or more means. Normality was tested using the Shapiro–Wilk test, 
whereas F-tests (for unpaired t-tests) and Levene’s tests (for ANOVA) were used 
to assess equality of variance. If data were not normally distributed, we first tried 
to transform the data using the natural logarithm. If the transformed data were 
still not normally distributed or there was a significant difference 
in variance, an alternative non-parametric test was used. ANOVA and linear 
mixed models were used to check for main effects and interactions in experiments 
with repeated measures and more than one factor. The assumptions for linear 
mixed models were checked by careful inspection of the model residuals to check 
for normality and equality of variances. When main effects or interactions were 
significant, we did planned comparisons according to experimental design (for 
example, comparing laser on and off). Fisher's least significant difference tests 
were used for comparisons after ANOVA tests and least square means tests were 
used for comparison when linear mixed models were significant. Details on the 
statistical analysis used for hypothesis testing in the main figures can be found in 
the Supplementary Table 1. Statistical tests were done using Prism (GraphPad), 
MATLAB (MathWorks) statistical toolbox and R (R core team 2015, v.3.1.3, 
Ime4”). 
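
The decision sequence described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the P values are taken as already computed (for example, by Shapiro-Wilk and Levene's tests), and the function name `choose_test`, its arguments and the returned labels are hypothetical rather than taken from the analysis code used in the study.

```python
ALPHA = 0.05  # significance level stated in the Statistics section


def choose_test(p_normal_raw, p_variance_raw,
                p_normal_log=None, p_variance_log=None, alpha=ALPHA):
    """Pick a test family from assumption-check P values (hypothetical helper).

    p_normal_*   : P value of a normality test (e.g. Shapiro-Wilk)
    p_variance_* : P value of an equal-variance test (e.g. F-test or Levene)
    The *_log values refer to the natural-log-transformed data and may be
    omitted if no transformation was attempted.
    """
    normal_raw = p_normal_raw >= alpha
    equal_var_raw = p_variance_raw >= alpha
    if normal_raw and equal_var_raw:
        return "parametric (raw data)"
    if not normal_raw and p_normal_log is not None and p_variance_log is not None:
        # Raw data failed normality: try the log-transformed data instead.
        if p_normal_log >= alpha and p_variance_log >= alpha:
            return "parametric (log-transformed data)"
    # Still non-normal, or variances differ significantly.
    return "non-parametric"
```

A call such as `choose_test(0.01, 0.30, 0.20, 0.50)` represents data that fail the normality check until they are log-transformed.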

Code availability. MATLAB (MathWorks) codes used for data analysis are 
available from the corresponding author. 

Data availability. Source Data for Figs 1-4 have been provided with the online 
version of the paper and all other data that support the findings of this study are 
available from the corresponding author upon reasonable request. 
Extended Data Figure 2 | TH-Cre line and ArchT infection
characterization. a, TH-Cre mice were crossed with ROSA26R-YFP
mice (expression of YFP in Cre+ cells). This is an example of a midbrain
slice of a TH-Cre × ROSA26R-YFP mouse with TH+ neurons labelled
in red. The white line delimits the SNc, and the yellow and green lines
delimit areas that cover a depth of 200 μm above and below the nigra,
respectively, that were also targeted by stereological cell counts. Scale
bar, 100 μm. b, Example of a SNc sampling field. Arrowheads denote
examples of Cre+ cells that were TH−. Scale bars, 20 μm. c, Quantification
of the specificity of the Cre line for tagging TH+ cells (n = 3 slices; 117
counting frames were analysed). d, Representative merged image of VTA
and SNc after two weeks of infection. ArchT+ cells are labelled in green,
TH+ cells are labelled in red and merged colours in yellow. ArchT+ cells
are mainly confined to SNc. Scale bar, 500 μm. e, Detail of a SNc region
labelled for TH (red) and ArchT (green) expression. Arrows are examples
of TH+ and ArchT+ cells; closed arrowheads denote examples of TH+
and ArchT− cells. Scale bars, 20 μm. f, Efficiency of ArchT virus infection
(left). Specificity of ArchT virus infection (right). This was calculated by
quantifying the whole SNc stereologically, not only the area closest to
the infection (n = 6 slices from two ArchT mice, 122 counting frames).
g, Examples of the slices and fields used to do the stereological count
shown in f and h. Scale bars, 100 μm. h, Stereological quantification of the
number of SNc TH+ cells in YFP- and ArchT-expressing mice after two
weeks of infection (ArchT, n = 6 slices from two ArchT mice, 122 counting
frames; YFP, n = 6 slices from two YFP mice, 124 counting frames).
i, Photomicrograph of a midbrain slice of an ArchT-expressing mouse
at the end of the experiments (open field and FR8). Red indicates TH+
cells and green indicates ArchT+ cells. Scale bar, 100 μm. Data are
mean ± s.e.m. (c, f, h).
Extended Data Figure 3 | Photoidentification and clustering of SNc
dopamine neurons. a, Photomicrograph of a midbrain slice of a
TH-Cre;Ai32 mouse denoting the right SNc and VTA. ChR2 in green and
TH+ cells in red. Initial electrode position (dashed square) and distance
travelled (dashed triangle). Scale bar, 100 μm. b, Example of continuous
recording of a photoidentified neuron. Blue triangles denote 10-ms
pulses of blue light that were delivered at 1 Hz. c, PETH of the neuron in
b aligned to blue-light delivery (100 pulses). d, Histogram of latencies to
modulation by light delivery. A latency threshold of 7 ms and an increase of
at least 30% in firing rate were used to define neurons as photoidentified
(blue bars). e, Mean spike traces for all photoidentified neurons used in
Fig. 2. The black trace represents the mean of spikes obtained without light
delivery and the blue trace represents the mean trace of spikes obtained
during light delivery. f, Left, the area under the ROC curve (auROC)
was calculated for each time bin of each significantly modulated neuron.
Right, we used an affinity propagation algorithm to cluster the traces that 
resulted from the auROC analysis (see Methods for details). Four clusters 
were found, of which the PETH of the representative neuron is shown. 
Neurons were: transiently active before the initiation of movement (blue), 
transiently active before the initiation of movement followed by inhibition 
after the initiation (grey), sustained increase in activity with movement 
initiation (green) or negatively modulated (red). 
Extended Data Figure 5 | In vivo external recordings reveal specific
inhibition of neuronal activity in the SNc. a, Mean unit activity aligned
to light onset (values less than 0.5 auROC indicate a decrease compared to
baseline and values more than 0.5 auROC indicate an increase compared to
baseline) at different recording depths. The green rectangle signals
the duration of light delivery. Left, mean of all units recorded. Right,
mean of all units except negatively modulated units. b, Top, anatomical
representation of the mean unit activity depending on recording depth
and the location of the cannula of the recording electrode (red, decrease
from baseline; blue, increase from baseline). The percentage of inhibited
cells was not homogeneous throughout all depths (χ²(4,140) = 18.01, P < 0.05,
test based on five levels of depth from −3.9 to −4.6 mm with 150-μm steps).
In fact, when we investigated the mean activity of all units recorded at
each depth, we found that the mean activity during light delivery changed
depending on the depth, and it was only significantly different at the
depths at which the SNc is located, for which the percentage of inhibited
units was 61.3%. This is anatomically represented in b. Depth (number
of neurons): −3.9 mm (3); −4 mm (22); −4.1 mm (14); −4.2 mm (6);
−4.3 mm (8); −4.4 mm (10); −4.5 mm (4); −4.6 mm (5). Kruskal-Wallis
test: H = 18.22; P = 0.011. Dunn’s multiple comparison test, all means
compared to mean at −4.6 mm: −3.9 mm, P > 0.99; −4 mm, P = 0.078;
−4.1 mm, *P = 0.017; −4.2 mm, **P = 0.008; −4.3 mm, P = 0.67;
−4.4 mm, P = 0.82; −4.5 mm, P > 0.99. Asterisks indicate depths with
mean auROC significantly different from −4.6 mm depth. c, Example of a
single unit inhibited by green light. The mouse brain has been reproduced
with permission from ref. 31.
Extended Data Figure 6 | Five-second inhibition of SNc TH+ neurons
during mobile and immobile trials. a, Acceleration during laser-off
and brief laser-on trials (5 s inhibition) when ArchT mice were mobile
before the start of the trial (n = 217 laser-on trials and n = 212 laser-off
trials obtained from 11 ArchT mice). The horizontal dotted line denotes
the threshold used to classify acceleration state. b, Acceleration during
laser-off and brief laser-on trials (5 s inhibition) when ArchT mice were
immobile before trial start (n = 17 laser-on trials from 5 ArchT mice and
n = 12 laser-off trials obtained from 7 ArchT mice). Horizontal dotted line
denotes the threshold used to classify acceleration state. c, Acceleration
during brief laser-on (5 s inhibition) normalized to mean laser-off
acceleration for immobile and mobile states (n = 17 laser-on immobile
trials obtained from 5 ArchT mice, n = 12 laser-off immobile trials
obtained from 7 ArchT mice; n = 217 laser-on mobile trials, n = 212
laser-off mobile trials obtained from 11 mice). Acceleration state significantly
affected normalized acceleration (linear mixed model with ‘mouse’ as a
random effect, F = 19.57, P < 0.0001).
Extended Data Figure 7 | Anatomical position of fibres, electrodes and
lens. a, Fibre placement (green, ArchT mice; yellow, YFP mice). b, Fibre
placement (blue, ChR2 mice; yellow, YFP mice). c, Optrode placement.
Horizontal blue lines denote the cannula position and vertical lines the
distance travelled by the electrodes. d, GRIN lens placement (green bar
indicates the position of the bottom of the lens). The representations of
the mouse brain and anatomical structures were obtained from the Allen
Mouse Brain Atlas (2004) using the API.
Top, http://api.brain-map.org/api/v2/svg_download/100960073?groups=28;
middle, http://api.brain-map.org/api/v2/svg_download/100960057?groups=28;
bottom, http://api.brain-map.org/api/v2/svg_download/100960525?groups=28.


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


end of training. Scale bars, 100 μm. b, Percentage of neurons significantly
not available for one mouse.




Extended Data Figure 9 | First press- and reward-related neuron
populations do not overlap. a, Example of a neuron modulated by first
press but not reward (left) and a neuron related to reward but not first
press (right), aligned to first press (blue) and reward consumption (red).
b, Monte Carlo simulations (10,000 samples) were used to generate
a distribution of the number of overlapping neurons for first press
and reward, assuming random assignment. Red lines denote the 95%
confidence interval. Dashed line represents the number of overlapping
neurons found in our experiment.
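
The kind of Monte Carlo null distribution described in this legend can be sketched in Python as follows. The sketch assumes that 'random assignment' means drawing the two modulated populations independently and without replacement from the recorded neurons. The function name `overlap_null` and the neuron counts in the usage line are hypothetical, as the legend does not give them.

```python
import random


def overlap_null(n_total, n_first_press, n_reward, n_samples=10_000, seed=0):
    """Simulate overlap counts between two randomly assigned neuron sets.

    Returns the sorted list of simulated overlap counts and the bounds of
    the central 95% interval of that distribution.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    neurons = range(n_total)
    counts = []
    for _ in range(n_samples):
        first = set(rng.sample(neurons, n_first_press))
        reward = set(rng.sample(neurons, n_reward))
        counts.append(len(first & reward))
    counts.sort()
    lower = counts[int(0.025 * n_samples)]
    upper = counts[int(0.975 * n_samples) - 1]
    return counts, (lower, upper)


# Hypothetical example: 100 recorded neurons, 20 first-press-modulated,
# 15 reward-modulated.
counts, (lower, upper) = overlap_null(100, 20, 15)
```

An observed overlap that falls outside `(lower, upper)` would be unlikely under random assignment.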
Extended Data Figure 10 | FR8-inhibition experiment using Jaws. We
replicated the result obtained in the FR8 task (Fig. 4h–j), using Jaws
(see Methods for details). a, Latency to initiate lever press sequence for
laser-off trials and trials with inhibition starting just before sequence
initiation for both Jaws (n = 6) and GFP (n = 6) groups. Two-way mixed
ANOVA; planned comparisons between laser-on and laser-off trials using
Fisher’s least significant difference tests; main effect group F1,10 = 3.16,
P = 0.11; main effect laser F1,10 = 9.074, P = 0.0131; interaction effect
F1,10 = 0.92, P = 0.36; planned comparisons: Jaws laser off − laser on
P = 0.019, GFP laser off − laser on P = 0.18. b, Press rate in trials with no
light delivery and trials with light delivery starting after the first press
for both Jaws (n = 5) and GFP (n = 6) groups. Two-way mixed ANOVA;
planned comparisons between laser-on and laser-off trials using Fisher’s
least significant difference tests; main effect group F1,9 = 1.607, P = 0.24;
main effect laser F1,9 = 0.53, P = 0.49; interaction effect F1,9 = 1.01,
P = 0.34; planned comparisons: Jaws laser off − laser on P = 0.86, GFP
laser off − laser on P = 0.23.
To facilitate clinical trials of disease-modifying therapies for
Alzheimer’s disease, which are expected to be most efficacious
at the earliest and mildest stages of the disease, supportive
biomarker information is necessary. The only validated methods
for identifying amyloid-β deposition in the brain (the earliest
pathological signature of Alzheimer’s disease) are amyloid-β
positron-emission tomography (PET) imaging or measurement of
amyloid-β in cerebrospinal fluid. Therefore, a minimally invasive,
cost-effective blood-based biomarker is desirable. Despite much
effort, to our knowledge, no study has validated the clinical
utility of blood-based amyloid-β markers. Here we demonstrate the
measurement of high-performance plasma amyloid-β biomarkers
by immunoprecipitation coupled with mass spectrometry. The
ability of amyloid-β precursor protein (APP)669–711/amyloid-β
(Aβ)1–42 and Aβ1–40/Aβ1–42 ratios, and their composites, to
predict individual brain amyloid-β-positive or -negative status
was determined by amyloid-β-PET imaging and tested using two
independent data sets: a discovery data set (Japan, n = 121) and a
validation data set (Australia, n = 252 including 111 individuals
diagnosed using 11C-labelled Pittsburgh compound-B (PIB)-PET
and 141 using other ligands). Both data sets included cognitively
normal individuals, individuals with mild cognitive impairment and
individuals with Alzheimer’s disease. All test biomarkers showed
high performance when predicting brain amyloid-β burden. In
particular, the composite biomarker showed very high areas under
the receiver operating characteristic curves (AUCs) in both data
sets (discovery, 96.7%, n = 121 and validation, 94.1%, n = 111) with
an accuracy approximately equal to 90% when using PIB-PET as
a standard of truth. Furthermore, test biomarkers were correlated
with amyloid-β-PET burden and levels of Aβ1–42 in cerebrospinal
fluid. These results demonstrate the potential clinical utility of
plasma biomarkers in predicting brain amyloid-β burden at an
individual level. These plasma biomarkers also have cost-benefit
and scalability advantages over current techniques, potentially
enabling broader clinical access and efficient population screening.

Attempts to use conventional enzyme-linked immunosorbent assay
(ELISA)-based techniques to assess plasma amyloid-β (Aβ) levels in
patients have not been successful (see Supplementary Information for
more detailed background information). Immunoprecipitation–mass
spectrometry (IP-MS) assays have been proposed as an alternative,
although the sample sizes in both of these studies were small (n = 62
and n = 41, respectively). Using IP-MS, we originally developed a
plasma biomarker that discriminated individuals with high levels of
Aβ (Aβ+) from individuals with low levels (Aβ−) with more than 90%
sensitivity and specificity when classified using PIB-PET. In that study,
we used IP-MS with matrix-assisted laser desorption ionization–time-
of-flight (MALDI-TOF) mass spectrometry, which can also be used for
protein quantification, to measure the ratio of plasma Aβ1–42 to a
novel APP669–711 fragment (APP669–711/Aβ1–42) (Extended Data Fig. 1a).
Here we improved the general applicability and reproducibility of
the previous IP-MS methodology through exploratory studies.
We found that the ratio of Aβ1–40/Aβ1–42 also performed at the same
level as APP669–711/Aβ1–42, and that a composite biomarker score that
incorporated both could further improve performance (Supplementary
Information and Extended Data Fig. 1b). Thus, we hypothesized that
APP669–711/Aβ1–42, Aβ1–40/Aβ1–42 and the composite biomarker
generated by the IP-MS assay were promising and potentially clinically
useful candidates for plasma biomarkers as surrogates for brain Aβ
burden. Our retrospective cross-sectional study tested this hypothesis
in a discovery data set from the Japanese National Center for Geriatrics
and Gerontology (NCGG) (121 samples), and was externally validated
using an independent data set derived from the Australian Imaging,
Biomarker and Lifestyle Study of Ageing (AIBL) cohort (252 samples)
(Table 1). Both data sets include a balanced number of individuals
clinically classified as cognitively normal, individuals with mild
cognitive impairment (MCI) and individuals clinically diagnosed with
Alzheimer’s disease (AD) with dementia. All samples had corresponding
Aβ-PET data obtained using PIB (NCGG and AIBL), flutemetamol
(FLUTE) or florbetapir (FBP) (AIBL). Information on the levels
of Aβ in cerebrospinal fluid (CSF Aβ) was available for a subset of
the AIBL cohort. The primary aim of the study was to assess the
performance of plasma-Aβ biomarkers for determining an individual's
status of Aβ deposition, using PIB-PET as the standard of truth. For
secondary outcomes, we examined the performance of the plasma-Aβ
biomarker against other PET ligands (FLUTE and FBP) and within
clinical categories (cognitively normal, MCI, and AD). We also
examined the correlations of plasma-Aβ biomarkers with Aβ-PET burden
and CSF Aβ values.

Table 1 | Demographics of the subjects in each study site (NCGG and AIBL)

                                  NCGG         AIBL
PET tracer                        PIB          PIB             FLUTE        FBP          AIBL overall  CSF
Sample size (n) total             121          111             81           60           252           46
Aβ+/Aβ−                           50/71        60/51           47/34        30/30        137/115       25/21
CN/MCI/AD (Aβ+ + Aβ−)             62/30/29     63/33/15        43/30/8      50/4/6       156/67/29     30/9/7
CN/MCI/AD (Aβ+)                   10/20/20     25/20/15*       20/19/8      21/3/6       66/42/29      13/5/7
CN/MCI/AD (Aβ−)                   52/10/9      38/13/0**       23/11/0      29/1/0       90/25/0       17/4/0
Age (Aβ+ + Aβ−, mean ± s.d.)      74.0±5.1     75.3±6.5        72.1±4.5     74.8±5.2     74.2±5.8      73.7±5.5
Age (Aβ+, mean ± s.d.)            75.3±4.7     75.3±6.3        72.1±4.4     75.7±4.8     74.3±5.6      72.4±4.2
Age (Aβ−, mean ± s.d.)            73.0±5.2     75.4±6.8        72.0±4.5     73.9±5.4     74.0±6.0      75.1±6.4
Gender (Aβ+ + Aβ−, M/F)           55/66        56/55           40/41        33/27        129/123       26/20
Gender (Aβ+, M/F)                 22/28        32/28           28/19        11/19        71/66         8/7
Gender (Aβ−, M/F)                 31/40        24/27           12/22        22/8         58/57         1/10
APOE4 (Aβ+ + Aβ−, +/−)            50/71        53/58           34/47        21/39        108/144       11/35
APOE4 (Aβ+, +/−)                  35/15        42/18           30/17        15/15        87/50         9/16
APOE4 (Aβ−, +/−)                  15/56        11/40           4/30         6/24         21/94         2/19
SUVR (Aβ+ + Aβ−, mean ± s.d.)     1.52±0.51    1.75±0.61***    0.71±0.23    1.12±0.21    1.71±0.55     1.64±0.47
SUVR (Aβ+, mean ± s.d.)           2.05±0.37    2.24±0.37****   0.86±0.2     1.29±0.17    2.11±0.43     1.98±0.37
SUVR (Aβ−, mean ± s.d.)           1.14±0.10    1.17±0.08       0.51±0.02    0.95±0.05    1.23±0.09     1.24±0.10

Breakdown of the number of subjects for each variable, with the exception of age and standardized uptake value ratio (SUVR) values. SUVR values represent SUVR for PIB, FLUTE and FBP, and
SUVR/BeCKeT (before the centiloid kernel transformation) values for AIBL overall and CSF. Site differences between NCGG and AIBL were tested only for PIB-PET groups using Student's t-test
(age and SUVR) or χ² test (all others). The CSF group is a subset of the AIBL data including 17 PiB, 18 FLUTE, and 11 FBP cases. Asterisks indicate statistically significant site differences: *P = 0.043,
**P = 0.014, ***P = 0.002, ****P = 0.007; two-sided tests.

Figure 1 and Extended Data Fig. 2a show the normalized intensity
of plasma Aβ as measured by IP-MS and the values of test biomarkers
for each study site. The test biomarker values were generated by
computing the ratio of the normalized intensity of the peptides. Aβ1–42 was
used as the denominator (APP669–711/Aβ1–42 and Aβ1–40/Aβ1–42), because
it yielded normal distributions (Extended Data Fig. 2b). The
composite biomarker was generated by combining normalized scores of
APP669–711/Aβ1–42 and Aβ1–40/Aβ1–42 with a pre-determined weight of
1:1 (Methods, Supplementary Information and Extended Data Fig. 1b).
All of the test biomarkers showed highly significant differences
(P < 0.0001, two-sided Student’s t-test or Welch's t-test) between the
Aβ+ and Aβ− groups (Extended Data Fig. 2a). At a single peptide level,
Aβ1–42 also showed highly significant group differences (P < 0.0001),
whereas APP669–711 did not show any group differences, and Aβ1–40
showed a group difference in the NCGG data set (P = 0.011), but not
in the AIBL data set. Significant (P < 0.05) site differences between the
NCGG and AIBL data sets were seen for all peptides and biomarkers
except for APP669–711.
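
As a minimal sketch of how a 1:1 composite of two normalized ratios could be formed, the Python example below z-scores each ratio and averages the two. It assumes that 'normalized scores' means z-scores computed against the mean and standard deviation of the supplied values; the normalization actually used is defined in the Methods and may differ. The function names and the example ratio values are hypothetical.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize values to zero mean and unit (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite_scores(app_ratio, ab40_ratio, weights=(1.0, 1.0)):
    """Weighted average of the z-scored APP669-711/Abeta1-42 and
    Abeta1-40/Abeta1-42 ratios; the default weights give the 1:1 composite."""
    z_app = zscores(app_ratio)
    z_ab40 = zscores(ab40_ratio)
    w_app, w_ab40 = weights
    total = w_app + w_ab40
    return [(w_app * a + w_ab40 * b) / total for a, b in zip(z_app, z_ab40)]


# Hypothetical ratio values for four plasma samples.
scores = composite_scores([0.8, 1.0, 1.2, 1.4], [20.0, 24.0, 22.0, 30.0])
```

Averaging z-scores puts the two ratios on a common scale, so neither dominates the composite merely because of its units.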

Figure 1 | The peptide and biomarker values in each study site. Box plots
showing each peptide (upper), and test biomarker (lower) value in the
NCGG (n = 121) and AIBL overall (n = 252) data sets. Significant group
differences are indicated by P values (two-sided Student's t-test or Welch’s
t-test). The boxes represent the 25th, 50th (median) and 75th percentiles of
the data; the whiskers represent the lowest (or highest) datum within 1.5 ×
interquartile range from the 25th (or 75th) percentile. See Extended Data
Fig. 2a for detailed values. AU, arbitrary units.

To evaluate the performance of plasma biomarkers in predicting brain
Aβ burden, we conducted receiver operating characteristic (ROC)
analyses with the discovery and validation data sets (Fig. 2a and Extended
Data Table 1a, left). Aβ1–42 peptide alone showed moderately high areas
under the curves (AUCs) in the discovery (NCGG) and validation
(AIBL PIB and AIBL overall) analyses with values (87.2%, 75.7% and
71.8% for NCGG, AIBL PIB and AIBL overall, respectively) far beyond
the chance level of AUC = 50% (asymptotic significance, P < 0.0001).
Compared with Aβ1–42, all of the test biomarkers (APP669–711/
Aβ1–42, Aβ1–40/Aβ1–42 and the composite biomarker) showed
significantly better predictive ability as evaluated by the net reclassification
improvement (NRI) and the integrated discrimination improvement
(IDI) (see Methods) in all analyses (Bonferroni-corrected P < 0.05)
(Fig. 2b). In addition, the AUCs of these three test biomarkers were
significantly higher than those of Aβ1–42 in all analyses (DeLong test,
Bonferroni-corrected P < 0.05) (see Methods) except for APP669–711/
Aβ1–42 in the NCGG data set. The composite biomarker showed the
highest AUCs in all analyses (96.7%, 94.1% and 88.3%, respectively,
for NCGG, AIBL PIB and AIBL overall). In the AIBL PIB and overall
analyses, the composite biomarker showed significant improvements
in NRI and IDI compared with both APP669–711/Aβ1–42 and Aβ1–40/
Aβ1–42 (Bonferroni-corrected P < 0.01) (Fig. 2b). In the NCGG data
set, Aβ1–40/Aβ1–42 showed identically high performance to the
composite biomarker. Comparisons between the NCGG PIB and AIBL
PIB analyses demonstrated that performances were generally lower
in the validation analyses, especially for Aβ1–40/Aβ1–42 (DeLong test,
uncorrected P = 0.026); however, the composite biomarker showed
similarly high performances with an AUC of approximately 95%
and approximately 90% accuracy. The AIBL overall (all PET tracers)
analyses showed slightly lower performances compared with the AIBL
PIB analyses. The biomarker performances in the analyses adjusted
for age, gender, clinical category and the presence of the APOE-ε4
(APOE4) allele showed a similar tendency to the unadjusted analyses,
while the adjusted analyses generally showed slightly higher AUCs than
the unadjusted analyses (Extended Data Fig. 3a, b and Extended Data
Table 1a, right).
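
For readers unfamiliar with the AUC values quoted above, the following Python sketch computes an empirical AUC as the probability that a randomly chosen Aβ+ individual has a higher biomarker value than a randomly chosen Aβ− individual (the Mann-Whitney formulation, with ties counted as one half). It is a generic illustration, not the ROC software used in the study, and the example scores are hypothetical.

```python
def auc(pos_scores, neg_scores):
    """Empirical area under the ROC curve from two groups of scores.

    Counts, over all positive/negative pairs, how often the positive score
    is higher (ties count as 0.5) and divides by the number of pairs.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical biomarker values: three of the four pairs are ranked correctly.
example_auc = auc([0.9, 0.4], [0.5, 0.1])  # 0.75
```

An AUC of 0.5 corresponds to the chance level mentioned in the text, and 1.0 to perfect separation of the two groups.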

As the composite biomarker showed the highest and most stable
performance across all analyses, it was the main focus of subsequent
tests. We further analysed the performance of the composite biomarker
against different Aβ-PET tracers. When the 18F-Aβ ligands FLUTE and
FBP were used to classify participants into Aβ+ or Aβ− groups, the
performances of the biomarkers were slightly lower than those obtained
with PIB (Fig. 2c and Extended Data Table 1b, left). Within the AIBL
data set, the AUCs of the composite biomarker for FLUTE (82.9%)
and FBP (86.4%) were lower than for PIB (94.1%) in the unadjusted
analyses (DeLong test, uncorrected P = 0.033 and 0.149 for FLUTE
and FBP, respectively). The adjusted (age, gender, APOE4, and clinical
category) analyses showed similar results (Extended Data Fig. 3c and
Extended Data Table 1b, right). Given that there were no significant
differences between the two independent PIB data sets, we consider
it to be unlikely that variability in the biomarker performance causes
the lower relative performance observed with 18F-Aβ tracers. It may
instead be the consequence of the higher variance and lower
performance of the 18F-Aβ tracers compared to PIB (see Supplementary
Discussion).
Figure 2 | High performance of the plasma biomarkers. a, ROC analyses
for each biomarker when predicting individual Aβ+/Aβ− status for the
discovery and validation data sets. Unadjusted analyses of the NCGG
PIB discovery data (left), the AIBL PIB (middle) and AIBL overall (all
tracers, right) validation data. See Extended Data Table 1a for detailed
performance values. Data are from 121, 111 and 252 individuals for the
NCGG PIB, AIBL PIB and AIBL overall data, respectively. b, Comparisons of
biomarker performances within each analysis corresponding to the ROC
curves in a. Each colour bar represents the AUC and 95% confidence
interval. Statistically significant differences between two AUCs (DeLong
test) are indicated by asterisks: *P < 0.05, **P < 0.01, ***P < 0.001.
Significant increments in predictive ability as assessed by NRI and IDI
are indicated by daggers and double daggers, respectively. † or ‡P < 0.05;
†† or ‡‡P < 0.01; ††† or ‡‡‡P < 0.001. All P values are two-sided and
Bonferroni corrected (multiplied by the number of comparisons, 6).
NS, not significant. c, Unadjusted ROC analyses of the composite
biomarker compared with different PET tracers: PIB (NCGG, n = 121,
and AIBL, n = 111), flutemetamol (AIBL, n = 81), and florbetapir (AIBL,
n = 60). See Extended Data Table 1b for detailed performance values.
d, e, Unadjusted ROC curves of the composite biomarker within the AD and
MCI (d), and cognitively normal (e) groups. For the AD and MCI group,
data are from 59 individuals for NCGG PIB and from 48 individuals
for AIBL PIB and the 18F-Aβ tracers. For the analyses of the cognitively
normal group, data are from 62, 63 and 93 individuals for NCGG PIB,
AIBL PIB and the 18F-Aβ tracers, respectively. See Extended Data Table 1c
for detailed performance values; corresponding results of the adjusted
analyses for a–e are shown in Extended Data Fig. 3.

Plasma-biomarker performances were also evaluated by clinical
category. To obtain a sufficient number in each subpopulation (as
estimated by power analysis; see Methods), AD and MCI were analysed as
one group (AD/MCI), and compared with the cognitively normal group.
The data for FLUTE and FBP were also grouped and analysed together
as 18F-Aβ tracers. The results of the ROC analyses for the composite
biomarker are shown in Fig. 2d, e and Extended Data Table 1c (left). Within
the clinical category of AD/MCI, the performance of the composite
biomarker against PIB and 18F-Aβ tracers was very high, with AUCs of
97.4% and 89.4% and accuracies of 91.7% and 89.6%, respectively, in the
unadjusted analysis for the validation AIBL data. Within the cognitively
normal group, performance against PIB was also high (AUC = 91.7%,
accuracy = 87.3%); however, performance against 18F-Aβ tracers was
considerably lower (AUC = 80.0%, accuracy = 79.6%), and it did not
reach significance (DeLong test, uncorrected P = 0.053). Adjusted (age,
gender and APOE4) analyses showed similar results (Extended Data
Fig. 3d, e and Extended Data Table 1c, right).
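
The Bonferroni correction mentioned in the legend of Fig. 2 (P values multiplied by the number of comparisons, 6) can be written as a short Python sketch. Capping the corrected value at 1 is standard practice but is an assumption here, as are the function name and the example P values.

```python
def bonferroni(p_values, n_comparisons=6):
    """Multiply each P value by the number of comparisons, capped at 1."""
    return [min(1.0, p * n_comparisons) for p in p_values]


# Hypothetical uncorrected P values from six pairwise comparisons.
corrected = bonferroni([0.004, 0.03, 0.5])
```

With six comparisons, an uncorrected P value must be below about 0.0083 to remain below 0.05 after correction.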

To evaluate the strength of the link between plasma biomarkers and 
AG-PET burden, we conducted correlation analyses. All of the plasma 
biomarkers, including A{_4. peptide alone, showed significant correla- 
tions with AB-PET burden (Fig. 3a and Extended Data Fig. 4a—d). The 
strongest correlations were found between PIB standardized uptake 
value ratio (SUVR) and the composite biomarker in the NCGG, AIBL 
and NCGG + AIBL combined data sets, with correlation coefficients 
of r=0.785, 0.684 and 0.735, respectively (Pearson’s correlation coef- 
ficient, all P< 0.0001). The correlation coefficients against FLUTE- 
SUVR (r=0.598, P< 0.0001) and FBP-SUVR (r=0.535, P< 0.0001) 
were slightly lower than those observed with PIB. The correlation 
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Figure 3 | Plasma biomarkers are significantly correlated with brain 
A® burden and CSF-A{j_4) level. a, Composite biomarker values plotted 
against SUVR values of A3-PET imaging for each tracer. Data are from 
121, 111, 81 and 60 individuals for the NCGG PiB, AIBL PiB, AIBL FLUTE 
and AIBL FBP analyses, respectively (see also Extended Data Fig. 4). 

Note that in the NCGG data, there are nine patients with AD who were 
clinically diagnosed as having AD but were PIB-PET classified as AB~ and 
thus probably did not have AD. b, Topographical associations between 
the composite biomarker values and cerebral AB burden as assessed 

by AB-PET imaging. Unadjusted (left) and adjusted (age and APOE4, 
right) regression analyses were performed with the NCGG + AIBL PIB- 
PET data set (n = 232). c, Regression analyses between AB-PET and the 
composite biomarker within APOE4 positive (n = 103) and negative 

(n= 129) individuals from the NCGG + AIBL PIB-PET data set. For 

b and c, brain regions that showed significantly positive correlations 
(FWE corrected P< 0.05) are visualized. d, Scatter plots for the CSF 

AB 1-42 level (n = 46) and plasma biomarker values. e, Scatter plots for 

the CSF AB;_4 level and PET SUVR/BeCKeT values. f, ROC analyses 


coefficient for the overall data set (NCGG + AIBL, all tracers) was 
r=0.678 (P< 0.0001). There were no significant correlations between 
the biomarker values and age or gender in the overall data set but a 
correlation between the composite biomarker and APOE4 (r= 0.464, 
P<0.0001) was observed, and the partial correlation adjusted for 
SUVR/BeCKeT (standardized uptake value ratio, before the centiloid 
kernel transformation!”) was r= 0.247 (P<0.0001). 

To further investigate the topographical associations between the 
plasma biomarkers and brain A8 deposition, we conducted regression 
analyses using SPM8 software (see Methods). The results showed sig- 
nificant and robust correlations between the plasma biomarkers and 
areas of high A® deposition in the brain. The best association was 
among Aβ-PET, CSF Aβ1–42 and the plasma biomarkers. Data are from 46 
individuals. ROC analyses of the plasma composite biomarker and CSF 
Aβ1–42 to Aβ-PET (top). The performances of the composite biomarker 
and CSF Aβ1–42 are AUC = 83.8% and 87.4%, sensitivity = 80.0% and 
64.0%, specificity = 81.0% and 100%, and accuracy = 80.4% and 80.4%, 
respectively. ROC analyses of the plasma biomarkers to CSF Aβ1–42, 
using a standard determinant for Aβ-positivity with a cut-off value of 
544 ng l−1 (bottom). The composite biomarker showed AUC = 87.6%, 
sensitivity = 100%, specificity = 69.0%, and accuracy = 80.4%. For the 
scatter plots in a, d, and e, the coloured circles represent clinical categories: 
AD (red), MCI (orange) and cognitively normal (blue). Pearson's 
correlation coefficients (r) and their significance (two-sided P) are 
presented in the plots. The vertical dashed lines represent the cut-off 
values of each Aβ-PET imaging tracer (a) and CSF Aβ1–42 (544 ng l−1) (d, e). 
Horizontal dashed lines represent the common cut-off values of the 
plasma biomarkers estimated in Extended Data Fig. 7a, d and of 
SUVR/BeCKeT (1.4) (e). 


observed when using the composite biomarker in both unadjusted 
and adjusted (age and APOE4) analyses (Fig. 3b). The topographical 
association patterns were similar both in APOE4-positive and -negative 
sub-group analyses (Fig. 3c). These results demonstrate the strong 
association between the plasma biomarkers and Aβ deposition in the brain. 

We also analysed the relationships between plasma biomarkers and 
CSF Aβ1–42. All of the plasma biomarkers, including the plasma Aβ1–42 
peptide alone, showed significant correlations with CSF Aβ1–42 
concentrations in an AIBL sub-group (n = 46) (Fig. 3d). The composite 
biomarker demonstrated the highest correlation (r = −0.660, P < 0.0001) 
with CSF Aβ1–42, which is as high as the correlation between the CSF 
Aβ1–42 and Aβ-PET SUVR/BeCKeT (r = −0.698, P < 0.0001) (Fig. 3e). 
In order to further elucidate the relevance of the three different types of 
Aβ-related biomarkers, we conducted ROC analyses among Aβ-PET, 
CSF Aβ1–42, and plasma biomarkers. If Aβ-PET is used as the standard 
classifier for Aβ+/Aβ− status, the plasma composite biomarker and 
CSF Aβ1–42 showed identical accuracy (80.4%) with AUCs of 83.8% and 
87.4%, respectively (Fig. 3f, upper). Also, if we use the CSF Aβ1–42 
as the standard classifier, the plasma composite biomarker showed 
87.6% AUC and 80.4% accuracy (Fig. 3f, lower). The performance of 
the plasma Aβ composite biomarker was comparable to that of CSF 
Aβ biomarkers!*-”. These results demonstrate that the three different 
types of Aβ-related biomarker (plasma and CSF Aβ, and PET imaging) 
are highly correlated with each other, clearly indicating that plasma 
Aβ biomarkers are strongly linked with the Aβ status of the CNS, but 
less affected by the Aβ known to be produced in peripheral tissues”’. 

The reasons for the high performance of the plasma Aβ assays and 
the reliability of our IP-MS method are discussed in detail in the 
Supplementary Discussion and demonstrated in Extended Data Fig. 5. 
It should be reiterated that our biomarkers are not peptide levels, but 
are the ratios of plasma Aβ1–42 to the reference peptides APP669–711 and 
Aβ1–40. As these reference peptides have similar amino acid sequences 
and molecular sizes to Aβ1–42, the large inter-individual variances 
in plasma Aβ levels, which are influenced by a wide variety of 
conditions>*4 or anti-Aβ autoantibodies’, should be reduced by the 
use of ratios. Several reports have proposed that the plasma ratio of 
Aβ1–40 and Aβ1–42 could be useful as a surrogate for brain Aβ status, 
although its performance has not been sufficient to allow reliable 
prediction of the individual status of brain Aβ burden*®?>. Aβ1–40 is 
known to aggregate less than Aβ1–42”°, but neither the nature nor the 
molecular behaviour of APP669–711 is known. Therefore, we performed 
two in vitro experiments, and found that APP669–711 is a real neuronal 
product (Supplementary Information and Extended Data Fig. 6a) and 
that APP669–711 showed much less self-assembly tendency than Aβ1–42 
(Supplementary Information and Extended Data Fig. 6b–d). 

There were considerable site differences in plasma Aβ and biomarker 
levels between the NCGG and AIBL data sets. We speculate that these 
were mainly due to pre-analytic factors, for example, differences in the 
procedures used for plasma processing (Supplementary Discussion). 
These between-site differences may complicate the establishment of 
a common cut-off value, which is essential for the widespread and 
multicentre use of biomarkers. It should be noted that this problem 
is still the biggest issue for CSF biomarkers, as highlighted recently”’, 
and it is proving difficult to solve. To elucidate the influence of these 
between-site differences on biomarker performance, we explored 
optimal common cut-off values applicable for both sites by performing 
additional ROC analyses for the combined data sets (NCGG + AIBL), 
which allowed us to assess the classifying ability under the same 
biomarker levels across the sites. The results demonstrated that the 
biomarker performances were also high in the combined data sets 
(Extended Data Fig. 7a), and that the composite biomarker showed 
the highest classifying ability (Extended Data Fig. 7b). When we applied 
cut-off values determined by the Youden’s index (see Methods) in the 
NCGG + AIBL overall data set, the performances were still very high in 
both the discovery and validation data sets (Extended Data Fig. 7c); for 
example, the composite biomarker showed 87.6% and 87.4% accuracy 
in the NCGG PIB and AIBL PIB data sets, respectively. The relevance 
of the cut-off values for biomarker performances is further visualized 
by the diagnostic performance plots in Extended Data Fig. 7d. These 
results support the stability of the biomarker performances with iden- 
tical cut-off levels between sites. 
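
The cut-off selection by Youden's index mentioned above can be illustrated with a minimal sketch. It scans every observed biomarker value as a candidate cut-off and keeps the one that maximizes J = sensitivity + specificity − 1. The function name, the toy data, and the convention that values at or above the cut-off are called positive are assumptions made for illustration; they are not taken from the study.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    Assumption of this sketch: a sample is called positive when its
    biomarker value is >= cutoff. labels are truthy for amyloid-positive.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if not y and v < cutoff)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1
        # Strict comparison keeps the lowest cut-off when several tie.
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]


# Toy data (hypothetical, not from the paper).
toy_values = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, sens, spec = youden_cutoff(toy_values, toy_labels)
```

Because J weights sensitivity and specificity equally, a setting that penalizes false negatives more heavily than false positives might prefer a different cut-off.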

Finally, we estimated possible clinical utility of the plasma biomarkers 
in several practical situations. First, we assessed the potential benefit of 
the plasma composite biomarker assuming two specific settings: screen- 
ing for preclinical AD or prodromal AD to identify potential clinical 
trial candidates (see Supplementary Discussion and Extended Data 
Fig. 8a, b). Both scenarios suggest that the plasma biomarker screens 
could reduce unnecessary Aβ-PET scans, substantially facilitating 
recruitment for clinical trials. Furthermore, we assessed the potential 
utility of the plasma biomarker in daily clinical practice. When there 
is diagnostic uncertainty about a clinical diagnosis of AD, Aβ-PET is 
considered to have a major clinical effect, providing diagnostic 
confidence or leading to changes in diagnosis”*. In the NCGG data set, there 
were 9 out of 29 (31%) patients who had been diagnosed with AD but 
were PIB-PET negative, and the composite biomarker classified eight 
of them as Aβ negative; therefore, the plasma biomarkers can also be 
expected to play an important clinical role. To confirm this possibility, 
we conducted an additional study with a new clinical data set consisting 
of 31 AD (22 Aβ+ and 9 Aβ−, classified by PIB-PET) and 20 non-AD 
(8 Aβ+ and 12 Aβ−) cases (see Supplementary Discussion and Extended 
Data Fig. 8c, d). The plasma composite biomarker showed 96.7% 
sensitivity, 81.0% specificity, and 90.2% accuracy in the overall data (n = 51) 
when predicting individual Aβ status (Aβ+ or Aβ−) using the common 
cut-off value (0.376) (Extended Data Fig. 8e–g). The results suggest that 
the plasma biomarker could be helpful for the differential diagnosis of 
AD and aid in determining therapeutic strategies, by providing 
additional information on the brain Aβ deposition status of individuals. As 
cost-benefit analysis of the use of Aβ-PET for this purpose has proven 
controversial”’, the impact of the plasma biomarker on daily clinical 
practice could be substantial. 

The findings of the present study are considered to be robust, repro- 
ducible and reliable because biomarker performance was validated in 
a blinded manner using independent data sets (Japan and Australia) 
and involved an established large-scale multicentre cohort (AIBL). 
However, there are still several issues that need to be addressed before 
general clinical application can be considered. First, further valida- 
tion studies (preferably in subjects drawn from primary care settings) 
coupled with longitudinal data will be needed. Second, standardized 
operating procedures for the analytical process as well as the pre- and 
post-analytical steps should be established*°, preferably through an 
international consortium. Under the controlled and standardized oper- 
ating procedures, optimal cut-off values as well as the optimal mathe- 
matical generation of the composite biomarker (see Supplementary 
Discussion and Extended Data Fig. 9a—c) should be established. Third, 
in clinical trials targeting Aβ reduction, the usefulness of this plasma 
Aβ biomarker as a monitoring tool remains to be evaluated. Fourth, 
biomarker performances for the differential diagnosis of other types of 
dementia need to be established. Finally, development of an automated 
assay system to stabilize the analytic factor and to enhance throughput 
of the IP-MS method is underway. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 

Subjects. The participants were aged 60 to 90 years. The discovery NCGG data 
set consisted of 62 cognitively normal individuals, 30 with MCI, and 29 with AD 
(121 in total) selected from in-house clinical studies at the NCGG. The AIBL data 
set for external validation consisted of 156 cognitively normal individuals, 68 with 
MCI, and 30 with AD (254 in total). 

All participants from the NCGG were native Japanese, recruited from commu- 
nity dwellings and outpatients of the National Hospital for Geriatric Medicine at 
NCGG. The clinical classification of NCGG subjects was determined by following 
the inclusion criteria of the Alzheimer's Disease Neuroimaging Initiative 2 
(ADNI2) study (http://adni.loni.usc.edu/). The definition of the cognitively normal 
group in the NCGG data set is generally equivalent to the cognitively normal 
group in the ADNI2 study. All of the AD and MCI subjects also fulfilled the 
diagnostic criteria developed by the National Institute on Aging and the Alzheimer’s 
Association (NIA-AA) (refs 31, 32). The samples were selected on the basis of age, clinical 
category (cognitively normal, MCI or AD), and data availability for both plasma 
measurements and Aβ-PET imaging data. Subjects under treatment for any 
substantial medical, neurological, or psychiatric disease, or with any history of a 
major psychiatric disorder, alcohol dependence, or substance dependence, were 
excluded. Individuals with any clinically significant focal brain lesion by MRI were 
also excluded. There were no individuals at the extremes of socio-economic status. 

AIBL is a two-site (Melbourne and Perth), longitudinal cohort study, integrating 
neuroimaging, biomarker, neuropsychometric, and lifestyle data. The AIBL study 
population was selected from English-speaking volunteers who responded to 
media advertisements, or clinical cases that were referred to the study by a network 
of doctors. The AIBL study has strict selection criteria to eliminate, as much as 
possible, comorbidities such as vascular disease and diabetes, but no requirement 
on socio-economic status. Approximately 48% of the AIBL cohort reported more 
than 13 years of education. Clinical classification of the AIBL study was deter- 
mined as previously described!*. The AIBL samples were selected with the same 
conditions as those selected from the NCGG, so that age, sex, and clinical category 
matched. 

In both the NCGG and AIBL data sets, all selected subjects had stored plasma 
samples and corresponding Aβ-PET imaging data that were acquired within one 
year of plasma sampling. The mean and s.d. of the time discrepancies between 
plasma sampling and PET imaging were 41.1 ± 97.5 and 115.7 ± 93.9 days for 
NCGG and AIBL, respectively. 

Both studies were approved by the appropriate institutional ethics committee 
(NCGG Ethics Committee, Japan, and Human Research Ethics Committee, 
Research Governance Unit, St Vincent’s Healthcare, Australia, respectively), 
and were performed following all relevant ethical regulations. Written informed 
consent was obtained from all participants (or their legal guardians) before 
participation. 

From the 254 plasma samples in the total AIBL data set, two outliers were 
excluded from the analyses. One subject’s abnormally high Aβ signals from IP-MS 
masked the peak of the internal standard, which prevented reliable measurements, 
and the other subject showed Aβ concentrations 9.2–20.5 times higher than the s.d. 

Imaging data. Aβ-PET imaging for the discovery set in NCGG was performed 
with 11C-PIB (PIB), while Aβ imaging for the AIBL validation set was performed 
with three different radiotracers: PIB, FLUTE, or FBP. The PET methodology for 
each tracer has been previously described (ref. 33). In brief, PET images were spatially 
normalized with CapAIBL using an adaptive atlas (ref. 34), and sampled using a preset 
template of narrow cortical regions of interest (ROI). For semi-quantitative 
analysis, a volume of interest template was applied to the summed and spatially 
normalized PET images in order to obtain a standardized uptake value (SUV). 
The images were then scaled to the SUV of each tracer’s recommended reference 
region to generate a tissue ratio termed the SUV ratio (SUVR). A global measure 
of Aβ burden was computed using the mean SUVR in the frontal, superior parietal, 
lateral temporal, lateral occipital, and anterior and posterior cingulate regions. For 
PIB, the SUVs were normalized to the cerebellar cortex; the whole cerebellum was 
used as the reference region for FBP (ref. 35), while for FLUTE the pons was used as the 
reference region (ref. 36), as advocated by the pharmaceutical companies that supplied 
each tracer. The SUVR was dichotomized as having a high (Aβ+) or low (Aβ−) 
Aβ burden, using a cut-off value that was determined for each tracer. Participants 
who underwent PIB were considered to have high Aβ when SUVR > 1.40, for 
FLUTE when SUVR > 0.55 and for FBP when SUVR > 1.05. For the analysis across 
different PET tracers, BeCKeT values, which are a linear transformed 
standardization of FLUTE and FBP SUVR onto ‘PIB-like’ SUVR”, were used. 
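
The tracer-specific dichotomization described above can be written as a small lookup. This is only an illustrative sketch: the thresholds are the ones stated in the text, while the function name and the status labels are hypothetical.

```python
# Tracer-specific SUVR cut-offs as stated in the text.
SUVR_CUTOFFS = {"PIB": 1.40, "FLUTE": 0.55, "FBP": 1.05}


def amyloid_status(tracer, suvr):
    """Classify a scan as high ('Abeta+') or low ('Abeta-') amyloid burden.

    The comparison is strict (SUVR > cut-off), matching the wording of the text.
    """
    if tracer not in SUVR_CUTOFFS:
        raise ValueError(f"unknown tracer: {tracer}")
    return "Abeta+" if suvr > SUVR_CUTOFFS[tracer] else "Abeta-"
```

The linear BeCKeT transformation itself is not sketched here because its coefficients are not given in this section.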

Individual MRI with high-resolution 3D T1-weighted, T2-weighted, and 
fluid-attenuated inversion recovery images were acquired for all NCGG and AIBL 
participants. These MRIs were used to exclude subjects who had substantial brain 
lesions. 
Voxelwise-correlational analyses were performed after spatially normalizing 
and scaling all PET images with CapAIBL (ref. 34). In brief, a combined plasma-PET 
statistical analysis was performed using SPM8 software, in which the associations 
between plasma biomarkers and Aβ-PET were estimated using a linear regression 
model within each cohort (discovery, validation, and all PIB combined) and each 
tracer (PIB, FLUTE, FBP, and all combined), using one plasma biomarker at a time 
as the dependent variable and Aβ-PET as the independent variable. The models 
were further examined after adjusting for age and APOE status. The statistical 
threshold for the voxelwise computations in SPM8 was set at P < 0.05, using FWE 
to correct for multiple comparisons at a peak level. 

Blood processing and plasma storage. In the NCGG study, blood samples 
were collected between 11:00 and 15:00. Plasma was isolated from whole blood 
collected in 7-ml EDTA-2Na tubes (Venoject II, TERUMO). Within 5 min 
of blood collection, whole blood was centrifuged at 2,273g for 5 min at room 
temperature. Otherwise, the blood was temporarily stored on ice for up to 30 min, 
and then centrifuged. The plasma was immediately transferred to storage tubes 
(48 Jacket Tubes 2.0 ml External-Type, FCR&Bio) as 250- or 500-μl aliquots, and 
frozen immediately in a −80 °C freezer. The plasma samples were stocked in the 
NCGG biobank, where each was assigned its own ID independent of the study 
ID. The research group did not intervene in sample collection, and shipping to 
the Shimadzu Corporation for IP-MS assays was performed in a blinded manner. 

In AIBL, blood samples were collected between 9:00 and 10:00. Plasma was 
isolated from whole blood collected in Sarstedt s-monovette 7.5-ml EDTA tubes 
(Sarstedt) with pre-added prostaglandin E1 (PGE1, Sapphire Bioscience) to 
produce a final PGE1 concentration of 33 ng ml−1 of whole blood. Processing 
started after the blood had equilibrated to room temperature and within 1 h of 
collection. Whole blood was centrifuged at 200g for 10 min at room temperature 
(no deceleration) to generate a platelet-rich plasma (PRP) layer. The PRP was 
transferred using 3-ml transfer pipettes (Livingstone) to a new 15-ml polypropylene 
centrifuge tube (Greiner Bio-One CELLSTAR). Both the collection tube and 
15-ml tubes were centrifuged at 800g for 15 min at room temperature, maximum 
deceleration. Plasma was combined into a new 15-ml polypropylene tube and 
spun at 3,200g for 30 min to remove debris. Plasma, as 250-μl aliquots, was stored 
in 1-ml capacity NUNC 2D barcoded Bank-IT polypropylene cryovials (NUNC) 
and frozen immediately on dry ice before long term storage in vapour-phase liquid 
nitrogen. For the AIBL samples, following confirmation that the proposed cohort 
had age, gender and APOE4 matching between the clinical groups, a new study 
ID was attached to the blood tubes before shipment to Shimadzu. Researchers 
at Shimadzu were blinded to any associated clinical or PET data until the data 
collected were complete and locked. 

Plasma Aβ measurements. Plasma Aβ levels were measured using IP-MS, which 
is an analytical technique that quantifies Aβ-related peptides of different mass in 
MALDI-TOF mass spectrometry after they have been isolated and enriched from 
abundant plasma proteins by immunoprecipitation using the specific affinity of 
an antibody. This assay was modified from previously reported procedures® with 
two major modifications being made. First, general antibody beads, prepared by 
coupling intact IgG monoclonal antibody 6E10 (BioLegend) directly to Dynabeads 
M-270 Epoxy (Thermo Fisher Scientific) according to the manufacturer’s protocol, 
were used for immunoprecipitation. This method for preparing the antibody beads 
is simpler and more practical than the previously reported method because it does 
not require generation and purification of two antigen-binding fragments (F(ab’)) 
from IgGs (clone 6E10 and 4G8) or the coupling of them on beads through PEG. 
Second, the immunoprecipitation procedure was carried out using two rounds of 
repeated processing, which both reduces the non-specific binding of abundant 
proteins that interferes with the signals of the Aβ-related peptides, and increases 
specificity during the detection of Aβ-related peptides in MALDI-TOF mass 
spectrometry. 

In detail, 250 μl of plasma was mixed with an equal volume of Tris buffer 
containing 10 pM stable-isotope-labelled (SIL) Aβ1–38 peptide (AnaSpec, San Jose, 
CA), 0.2% w/v n-dodecyl-β-D-maltoside (DDM) and 0.2% w/v 
n-nonyl-β-D-thiomaltoside (NTM). The SIL-Aβ1–38 peptide was used as internal standard for 
normalization of signals for all Aβ-related peptides in the mass spectrum, which 
was different from other mass spectrometry-based studies (refs 37, 38) that used 
synthetic peptides corresponding to each Aβ-related peptide as the internal standards 
(for example, SIL-Aβ1–42 for Aβ1–42, and SIL-Aβ1–40 for Aβ1–40). This was because 
Aβ1–38 is relatively easy to deal with as it has a lower self-aggregating tendency and 
lower adsorption in storage tubes when compared to Aβ1–42 (refs 39–41). More 
importantly, using a common internal standard has an advantage when computing 
peptide ratios, because it can cancel out any implicit errors related to the amounts 
of added SIL-Aβ1–38 caused by activities such as production, preparation and/or 
handling. Furthermore, using only one standard peptide is simpler than handling 
three standard peptides, which has cost-benefit implications (see Supplementary 
Discussion and Extended Data Fig. 5d–f). The nonionic detergents DDM and 
NTM were used for reducing nonspecific binding and obtaining high signals of 
Aβ-related peptides in MALDI-TOF mass spectrometry. The plasma Aβ-related 
peptides and internal standard were immunoprecipitated by incubating the 
antibody beads with the plasma sample for 1 h. The bound peptides were washed and 
eluted with glycine buffer (pH 2.8) containing 0.1% w/v DDM. After the pH was 
adjusted to 7.4 with Tris buffer, the immunoprecipitation was repeated once and the 
bound peptides were eluted with 70% acetonitrile containing 5 mM HCl. The eluted 
peptides were applied on four wells of a 900-μm μFocus MALDI plate™ (Hudson 
Surface Technology) which was prespotted with α-cyano-4-hydroxycinnamic acid 
(CHCA) and methanediphosphonic acid (MDPNA). Mass spectra were acquired 
using a MALDI-linear TOF mass spectrometer (AXIMA Performance, Shimadzu/ 
KRATOS) equipped with a 337-nm nitrogen laser in the positive ion mode. The 
m/z value and signal variability in the mass spectrometer were calibrated externally 
with a mixture of standard peptides to improve the precision of the Aβ-related 
peptide signal peak. The peak intensities were extracted using Mass++ software v2 
(ref. 42) (Shimadzu). The peptide mass tolerance for quantification was set within 
2.5 Da of the theoretical mass. The limit of detection was established at a 
signal-to-noise ratio of 3:1. One assay produced four mass spectra and the levels of plasma 
Aβ-related peptides were obtained by averaging the four spectra normalized with 
SIL-Aβ1–38. The normalized intensity was used as plasma Aβ-related peptide levels. 
The quantitativeness and reliability of the IP-MS assay were carefully validated 
by several steps as detailed in the Supplementary Discussion and Extended Data 
Fig. 5a–c. Using the IP-MS method, we tested linear relationships between the 
normalized signal intensity and the concentration of Aβ-related peptides in PBS 
containing 3 mg ml−1 bovine serum albumin, and in human plasma, and ensured 
the reliability of quantification. For example, we analysed the dose dependency of 
the normalized intensity for each of three synthetic peptides (Aβ1–42, Aβ1–40, and 
APP669–711) that had been spiked into the human plasma. The results showed very 
good linearity for each peptide with coefficients of determination (R²) between 
0.999 and 1.000 (Extended Data Fig. 5c). These R² values are as high as, or even 
higher than, those reported in a mass spectrometry-based study that used 
SIL-Aβ1–42 and SIL-Aβ1–40 as internal standards for each corresponding peptide 
measured in CSF (ref. 37). Each peptide is ionized differently during mass spectrometry; our 
standard curves show different slopes as previously reported (ref. 37), but this does 
not affect the robustness and reproducibility of quantification. We also verified 
the reproducibility of the assay using human EDTA plasma (Tennessee Blood 
Services). The intra- and inter-day assay coefficients of variation obtained for 
Aβ1–40 were 4.2–4.7% (n = 5) and 3.2–6.8% (n = 3), respectively; for Aβ1–42 the 
coefficients were 6.8–7.8% and 1.6–7.7%, respectively, and for APP669–711 the 
coefficients were 2.9–8.2% and 4.7–10.7%, respectively. These values are smaller than 
those obtained for within-laboratory CSF biomarker assays (ref. 43) (5% to 19%), 
supporting the reliability of our measurements. The IP-MS method can also measure 
other forms of plasma Aβ such as Aβ1–38 and Aβ1–39, but we did not focus on them 
in this study. 
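
The normalization against a single internal standard can be illustrated with a short sketch. It assumes that each of the four spectra is normalized to its own SIL-Aβ1–38 peak before averaging, which is one reading of the description above; the function names and the intensity values are hypothetical.

```python
from statistics import mean


def normalized_level(peak_intensities, standard_intensities):
    """Average of per-spectrum peak intensities divided by the internal standard peak.

    Assumption of this sketch: one peptide peak and one internal-standard
    peak are read from each replicate spectrum.
    """
    if not peak_intensities or len(peak_intensities) != len(standard_intensities):
        raise ValueError("need matching, non-empty intensity lists")
    return mean(p / s for p, s in zip(peak_intensities, standard_intensities))


def peptide_ratio(numerator_level, denominator_level):
    """Ratio of two normalized peptide levels (for example a reference peptide over Abeta1-42)."""
    return numerator_level / denominator_level


# Hypothetical intensities from four replicate spectra.
standard = [1.0, 2.0, 3.0, 4.0]
reference_peptide = [2.0, 4.0, 6.0, 8.0]
abeta42 = [1.0, 2.0, 3.0, 4.0]

ratio = peptide_ratio(
    normalized_level(reference_peptide, standard),
    normalized_level(abeta42, standard),
)

# Doubling the amount of internal standard changes both levels but not their ratio.
doubled = [2 * s for s in standard]
ratio_doubled = peptide_ratio(
    normalized_level(reference_peptide, doubled),
    normalized_level(abeta42, doubled),
)
```

The second computation shows why a common internal standard is convenient for ratios: a uniform error in the amount of added standard scales numerator and denominator alike and cancels.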

CSF biomarker measurements. In the AIBL data set, 46 subjects underwent 
CSF testing within two months of blood sampling and Aβ-PET imaging. The 
procedures for CSF sampling and biomarker measurements were performed as 
previously described". In this study, we focused on analysing CSF Aβ1–42 values, 
which were measured by ELISA (ref. 21). The cut-off value of CSF Aβ1–42 was 544 ng l−1 
(ref. 21, below which Aβ1–42 was considered abnormal). 

Sample size considerations. The power calculations for sample sizes in the study 
were estimated as follows: assuming that the biomarker candidates could be used 
to classify individuals as Aβ+ or Aβ− with a sensitivity of >80% and a theoretical 
sensitivity of 50%, the sample size required to achieve a statistical power of 80% 
at a 5% significance level would be 20 each for the Aβ+ and Aβ− groups. Also, 
assuming that the plasma biomarkers could show more than a 0.5 correlation 
coefficient (r) to Aβ-PET SUVR values or to CSF biomarker values, a total sample 
size of 21 would be required to achieve a statistical power of 80% at a 5% 
significance level. Both the NCGG and AIBL data sets, including the subpopulation with 
CSF data, satisfied these sample size requirements. 

Data analyses. Data analyses were performed in a blinded and independent 
manner. The plasma-Aβ measurements were performed at Koichi Tanaka Mass 
Spectrometry Research Laboratory (Shimadzu) without any clinical or imaging 
information. All of the PET imaging data were analysed by the AIBL imaging group. 
The Aβ-PET dichotomization (Aβ+/Aβ−) and generation of SUVR were performed 
without any clinical or biomarker information. The NCGG group conducted 
statistical analyses, and all results were confirmed by two independent biostatisticians. 
The test biomarker values were generated by computing the ratio of normalized 
intensity of Aβ1–42 with APP669–711, and Aβ1–40. We used Aβ1–42 as the denominator, 
because these ratios (APP669–711/Aβ1–42 and Aβ1–40/Aβ1–42) showed a normal 
distribution without any transformation in both the NCGG and AIBL data sets 
(Shapiro–Wilk test (ref. 44)), whereas using Aβ1–42 value as the numerator did not (Extended Data 
Fig. 2b). The composite biomarker was generated by combining the normalized 
scores of APP669–711/Aβ1–42 and Aβ1–40/Aβ1–42 as follows: first, the discovery NCGG 
data set was used for a standard database, and values of APP669–711/Aβ1–42 and 
Aβ1–40/Aβ1–42 in all data sets (both NCGG and AIBL) were normalized to z-scores 
using the mean (0.774 and 24.72, respectively) and s.d. (0.191 and 4.31, 
respectively) of the NCGG data; then, z-scores of APP669–711/Aβ1–42 and Aβ1–40/Aβ1–42 
were averaged for each subject and used as a value for the composite biomarker 
so that each biomarker contributed equally to the composite. Before the main 
analyses, the weight of the z-score composition was pre-determined as 1:1 by 
exploratory analyses at NCGG that were confirmed by a pilot study. 
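
As a minimal sketch of the composite calculation described above, the two ratios can be converted to z-scores with the NCGG means and s.d. given in the text and then averaged with equal (1:1) weight. The function and constant names are hypothetical; only the four numerical constants come from the text.

```python
# NCGG discovery-set reference values stated in the text.
APP_RATIO_MEAN, APP_RATIO_SD = 0.774, 0.191    # APP669-711 / Abeta1-42
AB40_RATIO_MEAN, AB40_RATIO_SD = 24.72, 4.31   # Abeta1-40 / Abeta1-42


def composite_biomarker(app_ratio, ab40_ratio):
    """Average of the two z-scores, each standardized against the NCGG data."""
    z_app = (app_ratio - APP_RATIO_MEAN) / APP_RATIO_SD
    z_ab40 = (ab40_ratio - AB40_RATIO_MEAN) / AB40_RATIO_SD
    return (z_app + z_ab40) / 2
```

By construction, a subject whose two ratios equal the NCGG means has a composite value of 0, and a subject one s.d. above both means has a composite value of 1.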

Statistical analyses were performed using R v.3.3.2, SPSS v.21 (IBM), and JMP 
software v.8 (SAS Institute). For categorical data, such as gender, clinical category 
and APOE4 carrier distributions, group differences were analysed using the χ² test. 
For numerical data, group differences were analysed by Student's t-test or Welch's 
t-test, and the effect size was assessed using Cohen's d. 

The biomarker performance when predicting Aβ+/Aβ− status was assessed 
using ROC analyses. The AUC, and the representative best values for the sensitivity, 
specificity and accuracy at an optimal cut-off point, were used for the 
performance measures. The cut-off points were determined by Youden’s index (ref. 45), which 
optimizes biomarker performance when equal weight is given to sensitivity and 
specificity. In addition, positive predictive value (PPV) and negative predictive 
value (NPV) were estimated by assuming the prevalence of Aβ+ individuals in 
specific settings. These values were computed as follows: 

Where TP = true positive, TN = true negative, FP = false positive, and 
FN = false negative; sensitivity = TP/(FN+TP), specificity = TN/(TN+FP), 
accuracy = (TP+TN)/(TP+TN+FP+FN), PPV = 1/(1 + ((1 − prevalence)/ 
prevalence)((1 − specificity)/sensitivity)), and NPV = 1/(1 + (prevalence/ 
(1 − prevalence))((1 − sensitivity)/specificity)). 
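
The formulas above translate directly into code. In the sketch below the function names are hypothetical, and the example counts (TP = 29, FN = 1, TN = 17, FP = 4) are not stated in the paper; they are inferred here as one set of counts consistent with the 96.7% sensitivity, 81.0% specificity and 90.2% accuracy reported for the additional clinical data set (n = 51, 30 Aβ+ and 21 Aβ−).

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (fn + tp),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV for an assumed prevalence, using the formulas in the text."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    if sensitivity <= 0 or specificity <= 0:
        raise ValueError("sensitivity and specificity must be positive")
    ppv = 1 / (1 + ((1 - prevalence) / prevalence) * ((1 - specificity) / sensitivity))
    npv = 1 / (1 + (prevalence / (1 - prevalence)) * ((1 - sensitivity) / specificity))
    return ppv, npv


# Counts inferred from the reported percentages (an assumption, see lead-in).
metrics = diagnostic_metrics(tp=29, tn=17, fp=4, fn=1)
ppv, npv = predictive_values(
    metrics["sensitivity"], metrics["specificity"], prevalence=30 / 51
)
```

When the assumed prevalence equals the prevalence in the sample itself, the PPV reduces to TP/(TP+FP) and the NPV to TN/(TN+FN); other prevalence values model settings such as screening, where Aβ+ individuals are rarer.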

We performed both unadjusted and adjusted ROC analyses. In unadjusted ROC 
analyses, original biomarker values from the discovery (NCGG) and validation 
(AIBL) data were used. In adjusted ROC analyses, a predictive formula including 
confounders (for example, age, gender, APOE4 and clinical category) was built 
using a generalized linear model (GLM) (binomial logistic regression analysis) 
on the discovery NCGG data as follows: 


$$\pi = \frac{1}{1 + e^{-(\alpha + \beta_1 x_1 + \dots + \beta_K x_K)}}$$


Then the same formula and the same coefficients were applied to both the 
NCGG discovery and AIBL validation data to calculate the fitted predicted prob- 
abilities. These predictive values were used for the adjusted ROC analyses of the 
NCGG and AIBL data. 
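
Applying a fitted logistic model with fixed coefficients to new data, as described above, amounts to evaluating the logistic function of the linear predictor. The sketch below is illustrative only: the function name is hypothetical and the coefficient values are placeholders, since the fitted coefficients are not reported in this section.

```python
import math


def predicted_probability(intercept, coefficients, covariates):
    """Logistic-model probability 1 / (1 + exp(-(alpha + sum(beta_k * x_k))))."""
    if len(coefficients) != len(covariates):
        raise ValueError("one coefficient is needed per covariate")
    linear_predictor = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Placeholder coefficients (hypothetical), reused unchanged for a second data set
# in the same way the discovery-set model is applied to the validation set.
alpha, betas = -1.0, [2.0, 0.5]
p_discovery = predicted_probability(alpha, betas, [0.8, 1.0])
p_validation = predicted_probability(alpha, betas, [0.2, 0.0])
```

Because the coefficients are frozen after fitting on the discovery data, the validation-set probabilities are genuine out-of-sample predictions rather than refitted values.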

To compare the biomarker performances among Aβ1–42, APP669–711/Aβ1–42, 
Aβ1–40/Aβ1–42, and the composite biomarker within the same data set, the 
differences between pairs of AUCs were statistically analysed using the DeLong test (ref. 46). 
The P values were Bonferroni-corrected by multiplying the P values by the number 
of comparisons (6) to control for the multiple comparisons problem. Improvement 
in the predictive ability of an alternative model was also assessed using the 
categorical NRI and IDI in the logistic regression model (ref. 47). For the categorical NRI, 
the reclassification ability was measured in four categories using the first, second 
and third quartiles of the original model’s fitted values as cut points. Statistical 
differences in AUCs between two different data sets were analysed using DeLong’s 
test for two uncorrelated ROC curves (ref. 46). 
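
The Bonferroni step can be sketched in a few lines. Six comparisons is consistent with all pairwise comparisons among the four biomarkers listed above, since C(4, 2) = 6. The function name and the example P values are hypothetical, and capping adjusted values at 1 is a common convention assumed here rather than something stated in the text.

```python
import math


def bonferroni(p_values, n_comparisons):
    """Multiply each P value by the number of comparisons, capped at 1."""
    return [min(1.0, p * n_comparisons) for p in p_values]


n_pairs = math.comb(4, 2)  # pairwise comparisons among four biomarkers
adjusted = bonferroni([0.001, 0.02, 0.5], n_pairs)
```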

Pearson product-moment correlational analysis was conducted to evaluate 
the strength of the association between each plasma biomarker and cortical Aβ 
deposition assessed by either Aβ-PET imaging or CSF Aβ values. All the tests were 
two-tailed, and the significance level of difference was set at P < 0.05. 
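
For reference, the Pearson product-moment coefficient can be computed as follows. This is a plain-Python sketch with a hypothetical function name and toy data; the study itself used R, SPSS and JMP.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

A negative coefficient, such as those reported between the plasma composite biomarker and CSF Aβ1–42, indicates that one measure falls as the other rises.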

Data availability. Source Data for graphs plotted in Figs 1-3 and Extended Data 
Figs 1-9 are available in the online version of this paper. All other data are available 
from the corresponding author upon reasonable request. 


31. Albert, M. S. et a/. The diagnosis of mild cognitive impairment due to 
Alzheimer’s disease: recommendations from the National Institute on 
Aging-Alzheimer’s Association workgroups on diagnostic guidelines for 
Alzheimer’s disease. Alzheimers Dement. 7, 270-279 (2011). 

32. McKhann, G. M. et a/. The diagnosis of dementia due to Alzheimer’s disease: 
recommendations from the National Institute on Aging-Alzheimer’s 
Association workgroups on diagnostic guidelines for Alzheimer’s disease. 
Alzheimers Dement. 7, 263-269 (2011). 

33. Rowe, C. C. et al. Amyloid imaging results from the Australian Imaging, 
Biomarkers and Lifestyle (AIBL) study of aging. Neurobiol. Aging 31, 
1275-1283 (2010). 

34. Bourgeat, P. et al. Comparison of MR-less PiB SUVR quantification methods. 
Neurobiol. Aging 36, S159-S166 (2015). 


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


35. 


36. 


37. 


38. 


39. 
40. 


Clark, C. M. et a/. Use of florbetapir-PET for imaging B-amyloid pathology. 

J. Am. Med. Assoc. 305, 275-283 (2011). 

Lundavist, R. et a/. Implementation and validation of an adaptive template 
registration method for !8F-flutemetamol imaging data. J. Nucl. Med. 54, 
1472-1478 (2013). 

Pannee, J. et a/. A selected reaction monitoring (SRM)-based method for 
absolute quantification of A838, A840, and A842 in cerebrospinal fluid of 
Alzheimer’s disease patients and healthy controls. J. Alzheimers Dis. 33, 
1021-1032 (2013). 

Patterson, B. W. et al. Age and amyloid effects on human central nervous 
system amyloid-beta kinetics. Ann. Neurol. 78, 439-453 (2015). 

Manzoni, C. et al. Overcoming synthetic A3 peptide aging: a new approach to 
an age-old problem. Amyloid 16, 71-80 (2009). 

Schlenzig, D. et a/. N-Terminal pyroglutamate formation of A338 and A840 
enforces oligomer formation and potency to disrupt hippocampal long-term 
potentiation. J. Neurochem. 121, 774-784 (2012). 


LETTER 


41. Toombs, J., Paterson, R. W., Schott, J. M. & Zetterberg, H. Amyloid-beta 42 
adsorption following serial tube transfer. Alzheimers Res. Ther. 6, 5 
(2014). 

42. Tanaka, S. et al. Mass++: A visualization and analysis tool for mass 
spectrometry. J. Proteome Res. 13, 3846-3853 (2014). 

43. Mattsson, N. et al. CSF biomarker variability in the Alzheimer’s Association 
quality control program. Alzheimers Dement. 9, 251-261 (2013). 

44. Shapiro, S. S. & Wilk, M. B. An analysis of variance test for normality (complete 
samples). Biometrika 52, 591-611 (1965). 

45. Youden, W. J. Index for rating diagnostic tests. Cancer 3, 32-35 (1950). 

46. DeLong, E. R., DeLong, D. M. & Clarke-Pearson, D. L. Comparing the areas 
under two or more correlated receiver operating characteristic curves: 
a nonparametric approach. Biometrics 44, 837-845 (1988). 

47. Pencina, M. J., D’Agostino, R. B. Sr, D’Agostino, R. B. Jr & Vasan, R. S. Evaluating 
the added predictive ability of a new marker: from area under the ROC curve 
to reclassification and beyond. Stat. Med. 27, 157-172 (2008). 






[Figure: a, schematic of the amyloid precursor protein (APP) showing the Aβ1-40, Aβ1-42 and APP669-711 sequences; b, ROC curves (sensitivity versus 1-specificity) for APP669-711/Aβ1-42, Aβ1-40/Aβ1-42 and the composite biomarker.]


Extended Data Figure 1 | The amino acid sequences of Aβ-related peptides 
and results of the pilot study. a, Overview of the amino acid sequences of 
the Aβ-related peptides Aβ1-40, Aβ1-42 and APP669-711. b, ROC analyses of 
the blinded pilot study for 20 Aβ+ and 20 Aβ− subjects (see Supplementary 
Information). The green, blue, and red curves indicate APP669-711/Aβ1-42, 
Aβ1-40/Aβ1-42 and the composite biomarker, respectively. The AUCs and the 
representative best values of sensitivity, specificity and accuracy for these 
biomarkers as determined by Youden’s index are as follows: APP669-711/Aβ1-42, 
AUC = 0.923, sensitivity = 0.850, specificity = 0.950, accuracy = 0.900; 
Aβ1-40/Aβ1-42, AUC = 0.930, sensitivity = 0.900, specificity = 0.900, 
accuracy = 0.900; composite biomarker, AUC = 0.975, sensitivity = 0.950, 
specificity = 0.950, accuracy = 0.950. 
a  The Aβ-related peptide and biomarker values at each study site (NCGG and AIBL)

Peptides      APP669-711                          Aβ1-40                              Aβ1-42
              NCGG              AIBL              NCGG              AIBL              NCGG              AIBL
Overall       0.271             0.278             8.87              8.33ᵃ             0.373             0.316ᶜ
(95% CI)      (0.261, 0.282)    (0.271, 0.285)    (8.50, 9.24)      (8.10, 8.57)      (0.352, 0.393)    (0.307, 0.326)
Aβ+           0.263             0.279             8.28              8.28              0.289             0.291
(95% CI)      (0.246, 0.281)    (0.270, 0.289)    (7.61, 8.94)      (8.01, 8.55)      (0.264, 0.314)    (0.281, 0.301)
Aβ−           0.277             0.277             9.29              8.39ᵇ             0.431             0.347ᶜ
(95% CI)      (0.265, 0.289)    (0.265, 0.288)    (8.88, 9.69)      (7.99, 8.79)      (0.410, 0.453)    (0.332, 0.362)
P-value       0.174             0.708             0.011             0.651             < 0.0001          < 0.0001
Cohen's d     0.252             0.047             0.508             0.059             1.570             0.791

Biomarkers    APP669-711/Aβ1-42                   Aβ1-40/Aβ1-42                       Composite biomarker
              NCGG              AIBL              NCGG              AIBL              NCGG              AIBL
Overall       0.774             0.896             24.72             26.70ᶜ            0                 0.548ᶜ
(95% CI)      (0.739, 0.808)    (0.877, 0.914)    (23.95, 25.50)    (26.21, 27.19)    (-0.170, 0.170)   (0.457, 0.640)
Aβ+           0.934             0.971             28.84             28.69             0.896             0.975
(95% CI)      (0.896, 0.972)    (0.948, 0.993)    (28.19, 29.49)    (28.13, 29.25)    (0.755, 1.037)    (0.871, 1.080)
Aβ−           0.661             0.807             21.82             24.33ᶜ            -0.631            0.040ᶜ
(95% CI)      (0.629, 0.694)    (0.786, 0.827)    (21.17, 22.48)    (23.72, 24.93)    (-0.776, -0.486)  (-0.055, 0.135)
P-value       < 0.0001          < 0.0001          < 0.0001          < 0.0001          < 0.0001          < 0.0001
Cohen's d     2.007             1.335             2.730             1.319             2.686             1.630
Extended Data Figure 2 | Peptide and biomarker values, and their 
distributions. a, The Aβ-related peptide and biomarker values in each 
study site (NCGG and AIBL). Normalized intensity of each peptide (top), 
and values for each biomarker (bottom) in the NCGG (n = 121) and AIBL 
overall (n = 252) data sets. Composite biomarker values are the average 
of the normalized values of APP669-711/Aβ1-42 and Aβ1-40/Aβ1-42. Peptide 
values are arbitrary units. Means and 95% confidence intervals (CI) in 
parentheses; P values show statistical differences between the Aβ+ and 
Aβ− groups (two-sided Student's t-test or Welch’s t-test). Superscripts 
indicate statistically significant site differences (ᵃP = 0.012, ᵇP = 0.002, 
ᶜP < 0.0001, two-sided). These site differences did not change when 
using analysis of covariance (ANCOVA) adjusted for semi-quantitative 
measures of Aβ-PET, using SUVR (PIB) and BeCKeT (FLUTE and FBP) 
values. b, Histograms of the biomarker value distributions for APP669-711/ 
Aβ1-42, Aβ1-40/Aβ1-42 (top), and their inversions (Aβ1-42/APP669-711 and 
Aβ1-42/Aβ1-40) (bottom). Blue and red bars represent the distributions of 
Aβ− and Aβ+ populations, respectively. P values represent the results of 
the Shapiro-Wilk test (see Methods). 
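
The composite biomarker described in this caption (the average of the normalized values of the two ratios) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "normalized" means a z-score against a reference data set (Extended Data Fig. 9 describes the values as z-scores normalized by NCGG data), that the sample standard deviation is used, and that weighted variants are divided by the sum of the weights. All function names and input values are hypothetical.

```python
from statistics import mean, stdev


def zscores(values, reference):
    """Normalize values against the mean and sample s.d. of a reference set."""
    mu, sd = mean(reference), stdev(reference)
    return [(v - mu) / sd for v in values]


def composite_biomarker(ratio_a, ratio_b, ref_a, ref_b, weights=(1.0, 1.0)):
    """Weighted average of two normalized ratios; 1:1 gives the plain average.

    Dividing by the sum of the weights is an assumption of this sketch.
    """
    wa, wb = weights
    za, zb = zscores(ratio_a, ref_a), zscores(ratio_b, ref_b)
    return [(wa * a + wb * b) / (wa + wb) for a, b in zip(za, zb)]


# Hypothetical ratio values, for illustration only (not study data).
ref_app_ratio = [0.60, 0.70, 0.80, 0.90, 1.00]   # stands in for APP669-711/Abeta1-42
ref_ab_ratio = [20.0, 22.0, 25.0, 28.0, 30.0]    # stands in for Abeta1-40/Abeta1-42

# Scoring the reference set against itself gives a composite centred on zero.
scores = composite_biomarker(ref_app_ratio, ref_ab_ratio, ref_app_ratio, ref_ab_ratio)
```

Under these assumptions, a data set scored against itself has a composite mean of zero, which is consistent with the NCGG overall composite value of 0 reported in panel a.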
Extended Data Figure 3 | Adjusted ROC analyses corresponding to 
Fig. 2. a, ROC analyses for each biomarker when predicting individual 
Aβ+/Aβ− status for the discovery, validation, and combined data sets. 
Adjusted (age, gender, APOE4, and clinical category) analyses for the 
NCGG PIB discovery data (left), AIBL PIB validation data (middle), 
and AIBL overall (all tracers) validation data (right). See Extended Data 
Table 1a for detailed performance values. Data are from 121, 111 and 
252 individuals for the NCGG PIB, AIBL PIB and AIBL overall data, 
respectively. b, Comparisons of biomarker performances within each 
analysis corresponding to the ROC curves in a. Each colour bar represents 
the AUC and 95% confidence interval. Statistically significant differences 
between two AUCs (DeLong test) and significant increments in predictive 
ability as assessed by NRI and IDI are indicated as in Fig. 2. All P values 
are two-sided, and Bonferroni corrected (multiplied by the number of 
comparisons, 6). Note that the NRI in the comparison between Aβ1-40/ 
Aβ1-42 and the composite biomarker in NCGG data was negative 
(NRI = −0.382) indicating that the reclassification ability is lower in 
the composite biomarker. c, Adjusted (age, gender, APOE4 and clinical 
category) ROC analyses of the composite biomarker compared by different 
PET tracers; PIB (NCGG, n = 121, and AIBL, n = 111), flutemetamol 
(AIBL, n = 81), and florbetapir (AIBL, n = 60). See Extended Data 
Table 1b for detailed performance values. d, e, Adjusted (age, gender, 
APOE4) ROC curves of the composite biomarker within the AD and 
MCI (d) and cognitively normal (e) groups. Sample sizes are the same 
as those listed in Fig. 2d, e; see Extended Data Table 1c for detailed 
performance values. 
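
The Bonferroni correction mentioned in this caption (each P value multiplied by the number of comparisons, 6) amounts to the small calculation below. It is a sketch only: capping the corrected value at 1 is a common convention that the caption does not state, and the raw P values are hypothetical.

```python
def bonferroni(p_values, n_comparisons):
    """Multiply each raw P value by the number of comparisons.

    Capping at 1.0 is an assumption of this sketch, not stated in the caption.
    """
    return [min(1.0, p * n_comparisons) for p in p_values]


# Hypothetical raw two-sided P values for six pairwise comparisons.
raw_p = [0.001, 0.004, 0.02, 0.2, 0.5, 0.9]
corrected_p = bonferroni(raw_p, n_comparisons=6)
```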
Extended Data Figure 4 | Correlations between plasma biomarkers 
and brain Aβ burden: additional data related to Fig. 3a. a-c, Biomarker 
values plotted against SUVR values from PIB-PET imaging; Aβ1-42 (a), 
APP669-711/Aβ1-42 (b) and Aβ1-42/Aβ1-40 (c). Data are from 121 (NCGG 
PIB, top) and 111 (AIBL PIB, bottom) individuals. Colours represent the 
clinical categories: AD, red; MCI, orange; cognitively normal, blue. The 
vertical dashed lines represent cut-off values of PIB-PET imaging (1.4), 
and horizontal dashed lines represent the common cut-off values of the 
plasma biomarkers estimated in Extended Data Fig. 7. d, A summary table 
for the correlation analyses. The sample sizes for each data set are: NCGG 
PIB, n = 121; AIBL PIB, n = 111; NCGG + AIBL PIB, n = 232; AIBL 
FLUTE, n = 81; AIBL FBP, n = 60; and NCGG + AIBL overall, n = 373. 
Pearson's correlation coefficients (r) and their significance (two-sided P) 
are presented in the plots and the table. 
Extended Data Figure 5 | Reliability of the IP-MS methods. a, Standard 
curves of Aβ1-42 (left), Aβ1-40 (middle), and APP669-711 (right) in PBS 
containing BSA. The standard curves were generated over a 2.5-40 pM 
range for Aβ1-42 and APP669-711, and a 10-160 pM range for Aβ1-40. The 
linearity was evaluated with the coefficient of determination (R²). The 
error bars indicate the standard deviations of normalized intensities 
obtained from four mass spectra. The normalized intensities (AU) and 
signal-to-noise ratios at the lowest concentration were 0.119 AU and 10.9 
for Aβ1-42, 0.152 AU and 16.1 for APP669-711 and 1.56 AU and 165 for 
Aβ1-40, respectively. The lower limit of quantification referred to the lowest 
concentration at which Aβ1-42, APP669-711 and Aβ1-40 showed a signal-to- 
noise ratio greater than 10. Data reproducibility was confirmed by two 
additional experiments. b, Relationships between plasma dilution and 
normalized intensity of endogenous Aβ1-42, Aβ1-40, and APP669-711, which 
were contained in the human plasma. Normalized intensity indicates 
the mass spectrometry signal normalized with the signal for SIL-Aβ1-38. 
The linearity was evaluated with R². The error bars indicate the standard 
deviations of normalized intensities obtained from four mass spectra. The 
data reproducibility was confirmed by one additional experiment. 
c, Signal linearity of plasma peptides spiked with synthetic Aβ1-42, 
synthetic Aβ1-40, and synthetic APP669-711. The plasma samples, which 
were spiked over a 2.5-40 pM range for Aβ1-42 and APP669-711, and a 
10-160 pM range for Aβ1-40, were prepared and measured by the IP-MS 
method. The linearity was evaluated with R². The error bars indicate 
the standard deviations of normalized intensities obtained from four 
mass spectra. The data reproducibility was confirmed by one additional 
experiment. d, Normalized signal intensity of Aβ1-42, Aβ1-40, and Aβ1-40/ 
Aβ1-42 in 19 subjects measured by two methods: using the common internal 
standard SIL-Aβ1-38 (x axis) and using the corresponding SIL-peptides 
(y axis). Pearson's correlation coefficients (r) and their significance 
(two-sided P) are presented in the plots. The experiments were performed 
once. e, ROC analyses for Aβ1-42 and Aβ1-40/Aβ1-42 to distinguish between 
Aβ+ and Aβ− individuals (n = 19) of the two methods: using the common 
internal standard SIL-Aβ1-38 (left) and using the corresponding SIL- 
peptides (right). f, Tables showing the performance values corresponding 
to e, as determined by ROC analyses and Youden’s index. 
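
The two quantities used in panels a-c, the coefficient of determination (R²) of a linear fit and the lower limit of quantification defined by a signal-to-noise ratio greater than 10, can be illustrated with the sketch below. It assumes an ordinary least-squares line and a doubling dilution series; only the 2.5 pM lowest level, the 0.119 AU intensity and the 10.9 signal-to-noise ratio come from the caption, and every other number and all function names are hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination of the ordinary least-squares line y ~ x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


def lower_limit_of_quantification(concentrations, snr, threshold=10.0):
    """Lowest concentration whose signal-to-noise ratio exceeds the threshold."""
    passing = [c for c, s in zip(concentrations, snr) if s > threshold]
    return min(passing) if passing else None


# Assumed doubling series spanning the 2.5-40 pM range given in the caption.
concentrations_pm = [2.5, 5.0, 10.0, 20.0, 40.0]
# Hypothetical intensities and signal-to-noise ratios, except the first of each.
intensities_au = [0.119, 0.24, 0.47, 0.95, 1.90]
signal_to_noise = [10.9, 21.0, 43.0, 85.0, 170.0]

linearity = r_squared(concentrations_pm, intensities_au)
lloq_pm = lower_limit_of_quantification(concentrations_pm, signal_to_noise)
```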
Extended Data Figure 6 | Cellular and molecular characteristics 
of APP669-711. a, Results of additional experiment 1 (Supplementary 
Information). Aβ-related peptides produced from human neuroblastoma 
cell line. MALDI-TOF mass spectra of Aβ-related peptides in human 
plasma (top), BE(2)-C cell culture supernatant (middle) and medium 
without cell culture (bottom). Representative spectra from five 
experiments are shown. The theoretical m/z values of peptides are 4,330.9 
for Aβ1-40, 4,515.1 for Aβ1-42, and 4,689.4 for APP669-711. SIL-Aβ1-38 
was used as an internal standard for the normalization of mass spectra. 
b-d, Results of additional experiment 2 (Supplementary Information). 
b, Kinetics of fibril formation. Aβ1-42 (15 μM, open circles) or APP669-711 
(15 μM, closed circles) were incubated in PBS at 37 °C. Fibril formation 
was monitored using the thioflavin T spectroscopic assay. Data are 
mean ± s.d. from four (Aβ1-42) or five (APP669-711) experiments. 
c, Size exclusion chromatography. Aβ1-42 (15 μM, left) or APP669-711 
(15 μM, right) were incubated in PBS at 37 °C, and the supernatants 
were centrifuged (10,000g for 10 min) and subjected to size-exclusion 
chromatography (Sephacryl S-300 HR) at 0 (black), 3 (red), 6 (blue), 
12 (orange), and 24 h (green). The elution times for molecular mass 
standards (kDa) are indicated by arrows. d, Changes in secondary 
structure during peptide aggregation. Aβ1-42 (15 μM, left) and APP669-711 
(15 μM, right) were incubated in PBS at 37 °C, and circular dichroism 
spectra were measured at 0 (black), 3 (red), 6 (blue), 12 (orange), and 24 h 
(green). Experiments in b-d were each performed once. 
Extended Data Figure 7 | Common cut-off values are applicable 
for both NCGG and AIBL data sets. a, Unadjusted ROC analyses for 
each biomarker when predicting individual Aβ+/Aβ− status for the 
NCGG + AIBL PIB (left, n = 232) and NCGG + AIBL overall (right, 
n = 373) data sets. b, Comparisons of biomarker performances within 
each analysis corresponding to the ROC curves in a. Each colour 
bar represents the AUC and 95% confidence interval. Statistically significant 
differences between two AUCs (DeLong test) and significant increments 
in predictive ability as assessed by NRI and IDI are indicated as in Fig. 2. 
All P values are two-sided, and Bonferroni corrected (multiplied by the 
number of comparisons, 6). c, Biomarker performances when applying 
the same cut-off values to each data set. For each biomarker, an optimal 
common cut-off value was determined by Youden’s index of the ROC 
analysis for the NCGG + AIBL overall data set. The sensitivity, specificity 
and accuracy were then calculated at the common cut-off point for each 
biomarker in all data sets. d, Diagnostic performance plots for Aβ1-42, 
APP669-711/Aβ1-42, Aβ1-40/Aβ1-42 and the composite biomarker. Each row 
from top to bottom shows the plots for the NCGG PIB, AIBL PIB, AIBL 
overall, NCGG + AIBL PIB, and NCGG + AIBL overall data (unadjusted), 
respectively. Sensitivity (orange), specificity (blue) and accuracy (green) 
were plotted using the values of the biomarkers (x axis). The vertical 
dashed lines indicate the common cut-off values as shown in c. The 
blue and pink shaded squares indicate ranges in which a biomarker can 
perform with at least 80% and 85% accuracy, respectively. 
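
The procedure in panel c (choose one cut-off by Youden's index on the pooled data, then compute sensitivity, specificity and accuracy at that cut-off in each data set) can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: candidate cut-offs are taken to be the observed values, a value at or above the cut-off is called positive, and all data and function names are hypothetical.

```python
def performance_at(values, labels, cutoff):
    """Sensitivity, specificity and accuracy when values >= cutoff are called positive."""
    tp = sum(1 for v, y in zip(values, labels) if y and v >= cutoff)
    tn = sum(1 for v, y in zip(values, labels) if not y and v < cutoff)
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg, (tp + tn) / len(labels)


def youden_cutoff(values, labels):
    """Cut-off maximizing Youden's index J = sensitivity + specificity - 1."""
    best_j, best_cutoff = -1.0, None
    for cutoff in sorted(set(values)):
        sens, spec, _ = performance_at(values, labels, cutoff)
        j = sens + spec - 1.0
        if j > best_j:
            best_j, best_cutoff = j, cutoff
    return best_cutoff


# Hypothetical pooled biomarker values; True marks an amyloid-positive subject.
pooled_values = [-1.2, -0.5, 0.1, 0.3, 0.4, 0.9, 1.1, 1.5]
pooled_labels = [False, False, False, False, True, False, True, True]
common_cutoff = youden_cutoff(pooled_values, pooled_labels)

# Apply the common cut-off to a separate (hypothetical) data set.
site_values = [-0.8, 0.2, 0.5, 1.3]
site_labels = [False, False, True, True]
site_sens, site_spec, site_acc = performance_at(site_values, site_labels, common_cutoff)
```

Fixing the cut-off on the pooled data and only then evaluating each site keeps the per-site figures from being tuned to that site.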
                                                     Total   Aβ+   Aβ−
AD                                                   31      22    9
Other types of dementing disorders (non-AD)          20      8     12
Overall                                              51      30    21

d  Clinical diagnosis of non-AD cases
                                                     Total   Aβ+   Aβ−
CBS (corticobasal syndrome)                          7       2     5
PSP (progressive supranuclear palsy)                 1       0     1
SD (semantic dementia)                               3       1     2
PNFA (progressive nonfluent aphasia)                 2       0     2
DLB (dementia with Lewy bodies)                      3       2     1
PDD (Parkinson's disease dementia)                   1       1     0
VCI (vascular cognitive impairment)                  1       0     1
CAA (cerebral amyloid angiopathy)                    1       1     0
iNPH (idiopathic normal pressure hydrocephalus)      1       1     0

[Figure panels e-g: ROC analyses of the overall data and performance of the composite biomarker.]
Extended Data Figure 8 | Possible clinical utility of the plasma 
biomarker. a, b, Diagnostic performance plots for the composite 
biomarker in two specific settings (see Supplementary Discussion). 
a, Diagnostic performance plots for subjects with MCI in the AIBL 
PIB unadjusted data (n = 33) (left). The prevalence of Aβ-positivity for 
subjects with MCI was assumed to be 66%. Sensitivity (orange), specificity 
(blue), accuracy (green), PPV (red, dashed) and NPV (dark blue, dashed) 
were plotted against the values of the composite biomarker (x axis). 
The black vertical dashed line indicates a cut-off point as determined 
by the Youden’s index (y point) in the AIBL PIB data. At the y point, 
the sensitivity and specificity were 0.900 and 0.923, respectively. With 
these values, relationships between the prevalence and PPV or NPV 
were plotted (right). Note that these data do not correspond to the ROC 
analysis shown in Fig. 2d, because this diagnostic performance plot 
analysis does not contain subjects with AD. b, Diagnostic performance 
plots for cognitively normal subjects in the AIBL PIB unadjusted data 
(n = 63) (left). The prevalence of Aβ-positivity in general elderly people 
was assumed to be 30%. At the y point, the sensitivity and specificity were 
0.880 and 0.868, respectively. With these values, relationships between the 
prevalence and PPV or NPV were also plotted (right). c-g, Results of the 
additional analysis for subjects with and without AD (see Supplementary 
Discussion). c, Sample numbers for subjects with and without AD. 
d, Detailed clinical diagnoses of subjects without AD. e, ROC analyses 
for each plasma biomarker in the overall data (n = 51). f, ROC analysis 
of the composite biomarker in the overall (n = 51), AD (n = 31), and 
non-AD (n = 20) groups. g, Performance of the composite biomarker 
using the common cut-off value. The AUC values were computed from the 
ROC analysis and sensitivity, specificity, and accuracy were computed by 
applying the common cut-off value for the composite biomarker (0.376). 
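
The relationship between prevalence and PPV or NPV plotted in panels a and b follows from Bayes' rule once sensitivity and specificity are fixed. The sketch below uses the sensitivity, specificity and assumed prevalence values quoted in the caption; the function name is hypothetical, and the caption does not say how the published curves were actually computed.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# MCI setting: sensitivity 0.900, specificity 0.923, assumed prevalence 66%.
ppv_mci, npv_mci = predictive_values(0.900, 0.923, 0.66)

# Cognitively normal setting: sensitivity 0.880, specificity 0.868, prevalence 30%.
ppv_cn, npv_cn = predictive_values(0.880, 0.868, 0.30)

# Sweeping the prevalence traces out the PPV and NPV curves.
curve = [(p / 10, *predictive_values(0.900, 0.923, p / 10)) for p in range(1, 10)]
```

With the same test characteristics, PPV rises and NPV falls as prevalence increases, which is why the high-prevalence MCI setting and the lower-prevalence cognitively normal setting are shown separately.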
               NCGG PIB                                               AIBL PIB
               Original weight   NCGG weight       AIBL weight       Original weight   NCGG weight       AIBL weight
Weight         1:1               1.14:3.59         3.04:1.95         1:1               1.14:3.59         3.04:1.95
AUC            0.967             0.969             0.962             0.941             0.919             0.943
(95% CI)       (0.942, 0.992)    (0.945, 0.993)    (0.935, 0.990)    (0.897, 0.984)    (0.865, 0.973)    (0.902, 0.983)
Sensitivity    1.000             1.000             0.960             0.917             0.833             0.933
Specificity    0.845             0.859             0.845             0.843             0.863             0.824
Accuracy       0.909             0.917             0.893             0.883             0.847             0.883


Extended Data Figure 9 | Optimal generation of the composite 
biomarker. Unadjusted ROC analyses of the composite biomarkers 
generated by different weightings (see Supplementary Discussion). 
a, Comparisons of the composite biomarkers generated by different 
weights for APP669-711/Aβ1-42 and Aβ1-40/Aβ1-42 normalized values 
(z-scores) in the discovery NCGG PIB data (n = 121, left) and validation 
AIBL PIB data (n = 111, right). The composite biomarker generated by 
the original weight (1:1) is coloured red, the weight estimated by the 
NCGG data (1.14:3.59) green, and the weight estimated by the AIBL data 
(3.04:1.95) blue. b, A comparison of the composite biomarkers generated 
by using different reference databases. The original composite biomarker 
(normalized by NCGG data) is coloured red, and the alternative composite 
biomarker normalized by AIBL data is orange. c, Summary of the ROC 
analyses. The AUCs and the representative best values of sensitivity, 
specificity and accuracy for these biomarkers as determined by the 
Youden’s index are shown. 
Extended Data Table 1 | Performance values of the plasma biomarkers 

a  Performances of each biomarker as analysed by the ROC analysis 

                                   Unadjusted                                          Adjusted
                                   Discovery        Validation                         Discovery        Validation
                                   NCGG PIB         AIBL PIB         AIBL overall      NCGG PIB         AIBL PIB         AIBL overall
Aβ1-42              AUC            0.872            0.757            0.718             0.913            0.812            0.797
                    (95% CI)       (0.804, 0.94)    (0.667, 0.847)   (0.655, 0.781)    (0.853, 0.973)   (0.733, 0.891)   (0.743, 0.851)
                    Sensitivity    0.740            0.783            0.635             0.820            0.800            0.774
                    Specificity    0.887            0.667            0.739             0.958            0.706            0.713
                    Accuracy       0.826            0.730            0.683             0.901            0.757            0.746
                    Cut-off        0.324            0.328            0.295             0.533            0.512            0.517
APP669-711/Aβ1-42   AUC            0.923            0.895            0.828             0.933            0.905            0.854
                    (95% CI)       (0.878, 0.967)   (0.839, 0.951)   (0.778, 0.878)    (0.893, 0.973)   (0.852, 0.958)   (0.808, 0.900)
                    Sensitivity    0.900            0.850            0.832             0.760            0.900            0.766
                    Specificity    0.803            0.784            0.696             0.944            0.765            0.817
                    Accuracy       0.843            0.820            0.770             0.868            0.838            0.790
                    Cut-off        0.779            0.861            0.849             0.651            0.443            0.571
Aβ1-40/Aβ1-42       AUC            0.967            0.889            0.837             0.979            0.897            0.851
                    (95% CI)       (0.942, 0.992)   (0.825, 0.952)   (0.787, 0.887)    (0.961, 0.997)   (0.837, 0.958)   (0.804, 0.898)
                    Sensitivity    0.960            0.733            0.657             0.900            0.833            0.745
                    Specificity    0.873            0.922            0.896             0.944            0.843            0.826
                    Accuracy       0.909            0.820            0.766             0.926            0.838            0.782
                    Cut-off        25.469           27.656           27.723            0.550            0.528            0.560
Composite           AUC            0.967            0.941            0.883             0.974            0.940            0.888
biomarker           (95% CI)       (0.942, 0.992)   (0.897, 0.984)   (0.840, 0.926)    (0.953, 0.995)   (0.898, 0.982)   (0.847, 0.929)
                    Sensitivity    1.000            0.917            0.854             0.920            0.833            0.876
                    Specificity    0.845            0.843            0.800             0.915            0.922            0.774
                    Accuracy       0.909            0.883            0.829             0.917            0.874            0.829
                    Cut-off        -0.079           0.425            0.425             0.407            0.663            0.466

b  Performance of the composite biomarker for each PET tracer 

               Unadjusted                                                           Adjusted
               Discovery        Validation                                          Discovery        Validation
               NCGG PIB         AIBL PIB         AIBL FLUTE       AIBL FBP          NCGG PIB         AIBL PIB         AIBL FLUTE       AIBL FBP
AUC            0.967            0.941            0.829            0.864             0.974            0.940            0.849            0.850
(95% CI)       (0.942, 0.992)   (0.897, 0.984)   (0.737, 0.921)   (0.772, 0.957)    (0.953, 0.995)   (0.898, 0.982)   (0.764, 0.934)   (0.750, 0.950)
Sensitivity    1.000            0.917            0.787            1.000             0.920            0.833            0.809            1.000
Specificity    0.845            0.843            0.824            0.633             0.915            0.922            0.794            0.700
Accuracy       0.909            0.883            0.802            0.817             0.917            0.874            0.802            0.850
Cut-off        -0.079           0.425            0.491            0.128             0.407            0.663            0.570            0.290

c  Performance of the composite biomarker within each clinical category 

               Unadjusted                                          Adjusted
               Within AD and MCI                                   Within AD and MCI
               PIB NCGG         PIB AIBL         ¹⁸F Aβ tracers    PIB NCGG         PIB AIBL         ¹⁸F Aβ tracers
AUC            0.980            0.974            0.894             0.983            0.978            0.905
(95% CI)       (0.953, 1.000)   (0.937, 1.000)   (0.799, 0.989)    (0.959, 1.000)   (0.945, 1.000)   (0.818, 0.992)
Sensitivity    0.975            0.914            0.944             0.875            0.886            0.944
Specificity    0.895            0.923            0.750             1.000            1.000            0.750
Accuracy       0.949            0.917            0.896             0.915            0.917            0.896
Cut-off        0.192            0.610            0.124             0.868            0.863            0.549

               Within CN                                           Within CN
               PIB NCGG         PIB AIBL         ¹⁸F Aβ tracers    PIB NCGG         PIB AIBL         ¹⁸F Aβ tracers
AUC            0.912            0.917            0.800             0.942            0.873            0.786
(95% CI)       (0.840, 0.984)   (0.849, 0.985)   (0.705, 0.895)    (0.886, 0.999)   (0.787, 0.958)   (0.692, 0.879)
Sensitivity    1.000            0.880            0.780             1.000            0.840            0.780
Specificity    0.865            0.868            0.808             0.827            0.816            0.731
Accuracy       0.887            0.873            0.796             0.855            0.825            0.753
Cut-off        -0.112           0.425            0.491             0.130            0.355            0.402


a, Performances of each biomarker as analysed by the ROC analyses corresponding to Fig. 2a (unadjusted analysis, left) and Extended Data Fig. 3a (adjusted analysis, right). All values except AUC are 
the representative best values for each ROC analysis at a cut-off point determined using Youden’s index (see Methods). The cut-off values for the adjusted analyses are predictive values of the logistic 
regression analyses. b, Performance of the composite biomarker for each PET tracer. The left and right panels correspond to the results of Fig. 2c (unadjusted) and Extended Data Fig. 3c (adjusted), 
respectively. c, The performance of the composite biomarker within each clinical category. The left panel (unadjusted) corresponds to the results of Fig. 2d (top) and Fig. 2e (bottom). The right panel 
(adjusted) corresponds to the results of Extended Data Fig. 3d (top) and Extended Data Fig. 3e (bottom). 
Innate and adaptive lymphocytes sequentially shape 
the gut microbiota and lipid metabolism 


Kairui Mao, Antonio P. Baptista, Samira Tamoutounour, Lenan Zhuang, Nicolas Bouladoux, Andrew J. Martins, 
Yuefeng Huang, Michael Y. Gerner, Yasmine Belkaid & Ronald N. Germain 


The mammalian gut is colonized by numerous microorganisms 
collectively termed the microbiota, which have a mutually beneficial 
relationship with their host¹⁻³. Normally, the gut microbiota matures 
during ontogeny to a state of balanced commensalism marked by the 
absence of adverse inflammation⁴,⁵. Subsets of innate lymphoid cells 
(ILCs) and conventional T cells are considered to have redundant 
functions in containment and clearance of microbial pathogens⁶,⁷, 
but how these two major lymphoid-cell populations each contribute 
to shaping the mature commensal microbiome and help to maintain 
tissue homeostasis has not been determined. Here we identify, using 
advanced multiplex quantitative imaging methods, an extensive 
and persistent phosphorylated-STAT3 signature in group 3 ILCs 
and intestinal epithelial cells that is induced by interleukin (IL)-23 
and IL-22 in mice that lack CD4+ T cells. By contrast, in immune-competent 
mice, phosphorylated-STAT3 activation is induced 
only transiently by microbial colonization at weaning. This early 
signature is extinguished as CD4+ T cell immunity develops in 
response to the expanding commensal burden. Physiologically, the 
persistent IL-22 production from group 3 ILCs that occurs in the 
absence of adaptive CD4+ T-cell activity results in impaired host 
lipid metabolism by decreasing lipid transporter expression in the 
small bowel. These findings provide new insights into how innate 
and adaptive lymphocytes operate sequentially and in distinct ways 
during normal development to establish steady-state commensalism 
and tissue metabolic homeostasis. 

To study the state and activity of innate and adaptive immune cells 
in situ in the context of the microbiota, we used quantitative multiplex 
immunohistochemistry (histo-cytometry)⁸⁻¹⁰. Staining for both 
cell phenotype and molecules that indicate active cytokine signalling 
enabled us to simultaneously examine the cell distribution, cytokine 
production and topography of cytokine responses in a complex tissue. 
Because many relevant cytokines induce phosphorylation of STAT3¹¹, 
we focused on the presence and distribution of phosphorylated STAT3 
(pSTAT3) in the distal small intestine of animals with an intact (wild 
type) or adaptive-lymphocyte-deficient (Rag1−/−) immune system 
under specific pathogen-free conditions. Using this approach, we 
observed substantial pSTAT3 in both CD3−RORγt+ group 3 innate 
lymphoid cells (ILC3s) and nearly all EpCAM+ intestinal epithelial cells 
(IECs) in the distal small intestine of Rag1−/− but not wild-type mice 
(Fig. 1a). RORγt+ ILC3s occur as a number of subpopulations with 
different locations in the gut: CCR6+ lymphoid-tissue inducer-like 
ILC3s reside mainly in cryptopatches and isolated lymphoid follicles, 
while NKp46+ ILC3s and NKp46− ILC3s are mostly located in the 
lamina propria¹². ILC3s in both locations were equally activated in 
Rag1−/− mice (Fig. 1a, b and Extended Data Fig. 1). Because the small 
intestine is colonized with commensal microbiota, we examined 
whether these microorganisms had a role in the pSTAT3 signature by 
treating specific pathogen-free Rag1−/− mice with broad-spectrum 
antibiotics or using germ-free Rag1−/− mice. All pSTAT3 signals in 
ILC3s and IECs were eliminated in these mice (Fig. 1c, d). Segmented 
filamentous bacteria (SFB) attach directly to IECs in the distal small 
intestine and contribute to TH17 cell differentiation and ILC3 activation 
in vivo¹³,¹⁴. We therefore mono-colonized germ-free Rag1−/− 
mice with SFB, which resulted in robust STAT3 activation in ILC3s 
and IECs (Fig. 1c, d). We conclude that STAT3 phosphorylation in 
ILC3s and IECs arises from signalling induced by microbiota, and that 
defined microbes such as SFB have a major role in this signalling in 
Rag1−/− mice. 

IL-23 and IL-22 are functionally linked cytokines with the potential 
to induce pSTAT3 in ILC3s and IECs, respectively¹⁵,¹⁶. We therefore 
examined pSTAT3 in Rag1−/− mice that were also deficient in Il23a 
(which codes for the IL-23p19 subunit) or Il22. In the absence of 
IL-23, pSTAT3 was no longer detected in ILC3s or IECs. By contrast, 
Il22−/−Rag1−/− mice had a similar proportion of pSTAT3+ ILC3s as 
Rag1−/− mice, but lacked pSTAT3+ IECs. The pSTAT3 signature in 
ILC3s and IECs was independent of IL-6 (Extended Data Fig. 2a, b). 
These data reveal that pSTAT3 generation in ILC3s depends on IL-23 
but not on IL-6 or IL-22, whereas pSTAT3 in IECs depends on both 
IL-23 and IL-22. Because ILC3s and not IECs express the IL-23 receptor, 
it seemed likely that STAT3 activation in ILC3s and IECs were 
sequential events downstream of IL-23 and IL-22, respectively. To test 
this hypothesis, we examined Rorc(γt)GFP/GFP Rag1−/− mice, which 
specifically lack ILC3s. The absence of pSTAT3 in IECs of these mice 
suggests that ILC3s are critical for STAT3 activation in epithelial cells in the 
small intestine of Rag1−/− mice (Extended Data Fig. 2c). Examination 
of Rag1−/− mice expressing an Il22-tdTomato reporter showed that 
pSTAT3+ ILC3s produced IL-22 (Extended Data Fig. 2d). To investigate 
the source of IL-23, we isolated different myeloid-cell populations 
from the small intestine of wild-type and Rag1−/− mice” (Extended 
Data Fig. 3a). CD11b+ conventional dendritic cells, as well as CCR2+ 
monocytes and monocyte-derived dendritic cells, can all express Il23a, 
but CCR2+ cells from Rag1−/− mice had higher Il23a expression than 
those from wild-type mice, whereas there was no significant difference 
in Il23a expression in the CD11b+ populations (Extended Data Fig. 
3b). To determine whether IL-23 production from CCR2+ myeloid 
cells was responsible for the pSTAT3 signature in Rag1−/− mice, we 
depleted these cells by injecting these mice with a CCR2 antibody. After 
two weeks, there were no pSTAT3+ ILC3s or IECs in antibody-treated 
Rag1−/− mice (Extended Data Fig. 3c), which indicates that IL-23 
production in CCR2+ cells is critical for STAT3 activation in the small 
intestine of Rag1−/− mice. Similar results were obtained using a more 
broadly depleting antibody against GR1 (Extended Data Fig. 3c). 
Figure 1 | pSTAT3+ ILC3s and IECs induced by microbiota in small 
intestine of Rag1−/− mice. a-d, Immunofluorescence staining of ileum 
(a, c) and quantification of pSTAT3+ ILC3s (b, d) from wild-type (WT) or 
Rag1−/− mice (n = 6; a, b), and from specific pathogen-free (SPF) Rag1−/− 
mice with or without antibiotics treatment (n = 3), germ-free Rag1−/− mice 
(n = 4) and germ-free Rag1−/− mice mono-colonized with SFB (n = 7; c, d). 
ABX, antibiotics; GF, germ-free. Results are representative of two (c, d) or 
three (a, b) independent experiments. Bars indicate mean; ****P < 0.0001, 
otherwise exact P values are shown; two-tailed Student's t-test (b) or one- 
way ANOVA (d). Scale bars, 50 μm (except a, left, 500 μm). 


Together, IL-23 and IL-22 constitute a circuit that involves ILC3s and 
produces strong signalling in IECs in animals that lack an adaptive 
immune system. Though such a circuit was previously reported upon 
pathogen infection of Rag1-deficient mice’, these new findings show 
that ILC3s and IECs are robustly and persistently activated by the 
commensal microbiota in the absence of adaptive immunity. 

Responses to microbes typically involve sequential activity of the 
innate and then adaptive immune systems. However, whether ILC 
responses that are thought to qualitatively parallel those of CD4+ 
T cells with which they share master transcription factor expression 
(for example, RORγt expression) operate sequentially or in parallel is 
unknown. The difference in IL-22-dependent IEC signalling between 
wild-type and Rag1−/− adult mice suggested that in wild-type mice, 
innate cells might be activated by the microbiota before an effective 
adaptive immune response develops. To explore this possibility, we 
examined ileal pSTAT3 in wild-type and Rag1−/− progeny of SFB+ 
mothers. In neonatal mice, which have a less diverse microbiota 
and limited SFB colonization’, neither wild-type nor Rag1−/− mice 
had pSTAT3+ ILC3s or IECs. However, shortly after weaning, when 
bacterial colonization and expansion of the SFB population occurs, 
there was substantial activation of ILC3s and STAT3 signalling in 
Figure 2 | Transient activation of ILC3s and IECs shortly after weaning 
in wild-type mice. a, Immunofluorescence staining of ileum from wild- 
type and Rag1−/− mice at the indicated ages (n = 5). b, Quantification of 
pSTAT3+ ILC3s in a. c, Expression of the indicated genes in the ileum 
from mice as in a (n = 5). Results are representative of two independent 
experiments. Mean ± s.d.; ****P < 0.0001, otherwise exact P values are 
shown; two-way ANOVA. Scale bars, 50 μm. 


IECs in wild-type mice, although this was less robust than in Rag1−/− 
mice (Fig. 2a, b). As the adaptive immune system of wild-type mice 
matured, ILC3s were no longer activated, whereas the activation 
persisted at high levels in ILC3s in Rag1−/− mice (Fig. 2a, b). Expression of 
Il22 and host-defence genes correlated precisely with ILC3 activation 
(Fig. 2c). These data indicate that during ontogeny, innate lymphoid 
cells operate before the adaptive system has fully developed’®”°. The 
emerging adaptive response then largely silences the ILC response and 
establishes a homeostatic state of non-inflammatory commensalism. 
Notably, the absence of the pSTAT3 signature in IECs from adult 
wild-type mice indicates that the effector mechanisms of the adaptive 
immune response to commensals is mechanistically distinct from that 
of ILC3s and that the existing paradigm that equates the effector func- 
tions of ILCs with subsets of CD4*-effector-T cells is not universally 
applicable?!. 

To study the potentially distinct effects of innate and adaptive 
lymphocytes on the microbiota, we measured the abundance of 
different bacterial species in the small intestines of co-housed wild-
type, Rag1−/− and Il23a−/−Rag1−/− mice. In the absence of adaptive
immunity, most bacteria, including SFB, were increased in Rag1−/−
mice. However, in the absence of STAT3 activation, SFB abundance
was further increased in Il23a−/−Rag1−/− mice (Extended Data Fig. 4a).
Although both innate and adaptive lymphocytes control the quantity 
of SFB, scanning electron microscopy revealed that SFB have markedly 
different morphology depending on whether adaptive lymphocytes 
are present. In the ileum of Rag1−/− mice, activated ILC3s prevented
development of SFB into long filamentous forms, and the adaptive lym- 
phocytes limited the number of SFB that attached to epithelial cells 
(Extended Data Fig. 4b, c). These findings show that innate and adap- 
tive lymphocytes adopt different strategies to regulate the commensal 
state, with the activity of adaptive lymphocytes dominating over that 
of innate lymphoid cells under these conditions. 

We next investigated which host adaptive immune cells prevent 
ILC3 activation and epithelial-cell signalling. First, we compared the 
distribution of pSTAT3 in the ileum of wild-type, Rag1−/−, Tcra−/−
(T cell-deficient) and Ighm−/− (B cell-deficient) mice. Neither T cell-deficient
nor B cell-deficient mice had pSTAT3+ ILC3s or IECs (Extended
Data Fig. 5a, b). As the pSTAT3 signature in Rag1−/− mice depended

Figure 3 | Role of CD4+ T cells in controlling ILC3 and IEC activation.
Immunofluorescence staining of ileum (a, c and e) and quantification
of pSTAT3+ ILC3s (b, d and f) from co-housed wild-type, Rag1−/−,
Tcra−/− and Ighm−/− mice (n = 4; a, b); co-housed Rag1−/− (n = 6),
B2m−/− (n = 3), H2-Ab1−/− (n = 4) and H2-Ab1−/−B2m−/− (n = 4)
mice (c, d); and Rag1−/− mice (n = 6) or Rag1−/− mice with adoptive
transfer of CD4+ T cells (n = 6; e, f). Results are representative of two
independent experiments. Bars indicate mean; ****P < 0.0001,
otherwise exact P values are shown; one-way ANOVA (b, d) or
two-tailed Student's t-test (f). Scale bars, 50 μm.


on specific bacteria such as SFB (Fig. 1c), we co-housed wild-type, 
Rag1−/−, Tcra−/− and Ighm−/− mice for 4-6 weeks and examined
STAT3 phosphorylation in these animals. After co-housing, ILC3s
and IECs from Tcra−/− mice but not Ighm−/− mice showed strong
STAT3 phosphorylation (Fig. 3a, b), consistent with a change in the
abundance of SFB in the co-housed Tcra−/− mice (Extended Data
Fig. 5c). These data show that αβ T cells are critical for preventing
ILC3 activation and IEC signalling induced by SFB. As the presence
of SFB was critical for the pSTAT3 signature, all the mice used in
the following experiments were co-housed with Rag1−/− mice unless
otherwise noted. 

To identify whether CD4+ helper or CD8+ cytotoxic cells were
responsible for inhibiting ILC3 activation, we examined STAT3 acti-
vation in MHCII-deficient (H2-Ab1−/−), MHCI-deficient (B2m−/−)
and MHCI and MHCII double-deficient (H2-Ab1−/−B2m−/−)
mice, which specifically lack CD4+ T cells, CD8+ T cells or both,
respectively. After co-housing with Rag1−/− mice, both H2-Ab1−/− and
H2-Ab1−/−B2m−/− mice had pSTAT3+ ILC3s and IECs, whereas the
equivalent cells in B2m−/− mice were pSTAT3− (Fig. 3c, d). To investi-
gate whether CD4+ T cells were sufficient to suppress ILC3 activation,
we adoptively transferred total CD4+ T cells into Rag1−/− mice and

Figure 4 | Suppression of ILC3 activation by Treg and TH17 cells.
a-c, Immunofluorescence staining of ileum (a) and quantification of
pSTAT3+ ILC3s (b) and SFB (c) from the small intestine of Rag1−/− mice
(n = 7) or Rag1−/− mice with adoptive transfer of CD4+ FOXP3-GFP+
Treg cells treated with IL-2-IL-2-antibody complexes (n = 7). d, Expression
of Il23p19 in CD64+CCR2− macrophages (n = 5) and CCR2+ myeloid
cells (n = 5) from small intestine of mice as in a. e-j, Immunofluorescence
staining of ileum (e, h) and quantification of pSTAT3+ ILC3s (f, i) and
SFB (g, j) in distal small intestine of Rag1−/− mice (n = 4) and Rag1−/−
mice with adoptive transfer of 7B8-Tg T cells (n = 6) or CBir1-Tg T cells
(n = 7) (e-g), and Rag1−/− mice (n = 5) and Rag1−/− mice with adoptive
transfer of 7B8-Tg T cells treated with anti-IL-17A antibody (n = 5) or
control antibody (n = 5) (h-j). Results are representative of two independent
experiments. Bars show mean (b, c, f, g, i and j) or mean ± s.d. (d).
****P < 0.0001, otherwise exact P values are shown; two-tailed Student’s
t-test (b-d) or one-way ANOVA (f, g, i and j). Scale bars, 50 μm.


found that they prevented STAT3 phosphorylation in ILC3s and 
IECs (Fig. 3e, f). Reduction of ILC3 activity by CD4+ T cells has been
reported previously, but the mechanism was not addressed. FOXP3+
Treg and TH17 cells are the most abundant subsets of CD4+ T cells in the
intestinal lamina propria. To examine the possible role of Treg cells,
we isolated these cells from Foxp3GFP mice and transferred them into
Rag1−/− mice along with IL-2-IL-2-antibody complexes to maintain
their number and function. Six to eight weeks after transfer, SFB
abundance in the small intestine of these mice had not changed, but
pSTAT3 in ILC3s and IECs was diminished (Fig. 4a-c). This effect
was concordant with reduced Il23a expression in CCR2+ myeloid-cell
populations (Fig. 4d). SFB preferentially induces antigen-specific
effector CD4+ TH17 cells. To examine the role of TH17 cells in
controlling ILC3 activation, we took advantage of SFB-specific T cell
receptor transgenic mice (7B8-Tg). Adoptive transfer of naive 7B8-Tg
CD4+ T cells into Rag1−/− recipients markedly decreased SFB abun-
dance in the small intestine, limiting ILC3 activation. By comparison,
CBir1-Tg CD4+ T cells, which recognize commensal-derived
flagellin, had only a modest effect on the abundance of SFB and ILC3
activation (Fig. 4e-g). Treatment with IL-17A-neutralizing antibodies
partially reversed the effect of 7B8-Tg cells on ILC3 activation and
SFB reduction; the incomplete nature of this effect is likely to be due
to activities of other cytokines such as IL-17F (Fig. 4h, i). These data
indicate that both Treg and TH17 cells contribute to preventing ILC3
activation: Treg cells do so by decreasing IL-23 production from CCR2+
monocytes and monocyte-derived dendritic cells, whereas effector T 
cells regulate the bacterial burden. Both of these mechanisms prevent 
the activation of the IL-23-ILC3-IL-22-IEC circuit. 

Figure 5 | Disrupted lipid metabolism in ILC3 and IEC-activated mice.
a, RNA-seq analysis of ileal tissue from co-housed (Coh) wild-type, Rag1−/−
and Tcra−/− mice and non-co-housed Tcra−/− mice (n = 3). b-g, Expression
of indicated genes (b, e), serum triglyceride and free fatty acid levels (c, f),
and body composition (d, g) of wild-type (n = 5) or Rag1−/− mice
(n = 6) (b-d), and Rorc(γt)+/+Rag1−/− (n = 5) or Rorc(γt)GFP/GFPRag1−/−
mice (n = 5) (e-g). Results are representative of two independent
experiments. Bars show mean (b, d, e and g) or mean ± s.d. (c and f).
Exact P values are given and calculated by two-tailed Student's t-test.


The evidence that ILC3s and adaptive lymphocytes operate sequen- 
tially and use markedly distinct effector mechanisms in their inter- 
actions with commensal bacteria raised the question of whether 
persistent ILC3 activity in the absence of adaptive immune control 
might have a negative effect on host homeostasis. To examine this issue, 
we performed whole-tissue RNA sequencing (RNA-seq) analysis of ilea 
from co-housed wild-type, Rag1−/− and Tcra−/− mice and separately
housed Tcra−/− mice. Genes that encode cytokines and anti-microbial
peptides involved in microbial control were expressed at higher levels in
pSTAT3+ Rag1−/− and co-housed Tcra−/− mice. By contrast, metabolic
processing genes showed substantially lower expression in pSTAT3+
mice (Fig. 5a). Using quantitative real-time PCR (qPCR) analysis, we
confirmed the reduction in mRNA coding for key lipid transporters,
including Cd36, Npc1l1, Fabp1 and Fabp2, in Rag1−/− and co-housed
Tcra−/− mice (Fig. 5b and Extended Data Fig. 6a). This change was
associated with significantly decreased serum levels of triglycerides
and free fatty acids (Fig. 5c and Extended Data Fig. 6b). At the macro-
scopic level, Rag1−/− mice exhibited less fat accumulation than wild-
type mice that were fed a standard chow diet (Fig. 5d). Elimination of
ILC3s in Rorc(γt)GFP/GFPRag1−/− mice had the opposite effect, with
higher levels of lipid transporters, serum triglycerides and free fatty
acids, and higher fat storage than in Rag1−/− mice with activated ILC3s
(Fig. 5e-g). To investigate whether these lipid abnormalities were
specifically associated with persistent IL-22 production, we
administered adenoviruses expressing IL-22 or GFP to wild-type
and non-co-housed Tcra−/− mice. Injection of adenovirus expressing
IL-22 induced notable STAT3 activation in IECs of wild-type mice,
comparable to that seen in Rag1−/− mice (Extended Data Fig. 7), and
led to reduced expression of lipid transporters in the gut and lower
serum lipid concentration (Extended Data Fig. 6c-f). These findings
suggest that, in the absence of adaptive lymphocytes, although ILCs
are capable of constraining microbial communities, their persistent
activation results in abnormal lipid handling and tissue homeostasis.
Previous studies suggested a role for IL-22 in lipid metabolism, with
a recent report indicating that IL-22 promoted lipid transporter expres-
sion in IECs in the small intestine. It is likely that transient, low-level
IL-22 production has a notably different effect than the high-level,
persistent exposure of IECs to this cytokine that we report here.

Previous studies of ILCs have focused on the roles of these cells in 
resistance to pathogens or in immunopathological conditions such as 
chronic inflammatory diseases, with recent studies suggesting that they
also have a role in neurobiological function. Here we demonstrate a
clear role for these cells in the establishment of a compatible commensal
state. Early in life, they temper the expansion of some bacterial species,
especially those with inflammatory potential, protecting the epithe-
lium of the gut. As the adaptive immune system matures, CD4+ T cells
respond to the commensal population, and through mechanisms other
than the IL-23-induced IL-22 pathway used by the ILC3s, establish a
state of non-inflammatory commensalism in which the ILCs are largely
quiescent. In the absence of a dominant adaptive immune response, the
persistent activation of ILC3s results in impaired lipid metabolism. Our
findings may have bearing on recent studies that show that a failure
to develop a ‘mature’ gut microbiota is associated with severe nutri-
tional abnormalities and other pathological states in humans as well
as in germ-free animals colonized with microbial material from indi-
viduals with this condition. Taken together, our results increase
understanding of how the innate and adaptive immune systems act 
sequentially on the developing gut microbiota to establish a balanced 
commensal state that supports normal tissue function. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS


Data reporting. Mice of similar ages were randomly allocated into different 
groups. For most experiments (co-housing, antibody treatment and cell transfer 
experiments), mice were ear-tagged with numbers and investigators did not know 
the identity of the specific samples until after data were analysed. 

Mice. C57BL/6, Rag1−/−, Il23a−/−, Foxp3GFP, Tcra−/−, Ighm−/−, B2m−/−,
H2-Ab1−/− and H2-Ab1−/−B2m−/− mice were obtained from Taconic Laboratories
through a special NIAID contract. Rorc(γt)GFP/GFP, Il6−/− and 7B8-Tg mice were
purchased from Jackson Laboratories. Il22-tdTomato mice were provided by
S. K. Durum. Il22−/− mice were provided by Genentech. Unless specified, mice
used in the study were eight- to sixteen-week-old males. All mice were main- 
tained in specific-pathogen-free conditions at an Association for Assessment and 
Accreditation of Laboratory Animal Care-accredited animal facility at the NIAID 
and were used under a study protocol approved by NIAID Animal Care and Use 
Committee (National Institutes of Health). 

Germ-free Rag1−/− mice were provided by the Penn Gnotobiotic Mouse Facility
and experiments involving these mice were performed at the Penn Gnotobiotic 
Mouse Facility. Germ-free mice were maintained in sterile plastic isolator units 
and fed autoclaved LabDiet5021 mouse chow (LabDiet) and autoclaved water. 

Immunofluorescence staining and confocal imaging. The ileal portion of the
small intestine was excised and prepared using the Swiss roll technique, then 
incubated in a fixation and permeabilization solution (BD Bioscience, 554722) 
overnight followed by dehydration in 30% sucrose before embedding in OCT 
compound (Sakura Finetek). 18-μm sections were cut on a CM3050S cryostat
(Leica) and adhered to Superfrost Plus slides (VWR). Frozen sections were
treated with methanol for 20 min at −20°C and then permeabilized and blocked
in PBS containing 0.3% Triton X-100 (Sigma-Aldrich) and 10% normal mouse
serum (Jackson Immunoresearch) followed by staining with antibodies diluted
in blocking buffer. The following antibodies were used for staining: anti-CD3
(17A2, Biolegend), anti-EpCAM (G8.8, Biolegend), anti-RORγt (AFKJS-9,
eBioscience), anti-CD90.2 (30-H12, Biolegend) and anti-pSTAT3 (D3A7, Cell 
Signaling Technology). After staining, slides were mounted with Fluormount 
G (Southern Biotech), and examined on a Leica TCS SP8 confocal microscope. 
Images were analysed with Imaris software (Bitplane). 

Antibiotic treatment. Male six-week-old Rag1−/− mice were provided with
ampicillin (1 g/l), kanamycin (5 g/l), vancomycin (500 mg/l), neomycin trisulfate 
(1 g/l) and metronidazole (1 g/l) in drinking water for three weeks. All antibiotics 
were purchased from Sigma-Aldrich. 

Histo-cytometry. Histo-cytometry analysis was performed as previously 
described, with minor modifications. In brief, multi-parameter confocal images
were corrected for fluorophore spillover using the Leica Channel Dye Separation
module. For analysis of pSTAT3+ ILC3s, the ILC3 surface was constructed on the
basis of the RORγt channel, and the object statistics were exported into FlowJo X
(TreeStar) for analysis and graphing with Prism (GraphPad). 

Co-housing and mono-colonization with SFB. Wild-type age-matched male mice 
were ear-tagged and housed in a cage with an equal number of the respective 
knockout mice for 4-8 weeks. For association of germ-free Rag1−/− mice with SFB
(a gift from Y. Umesaki), faecal pellets isolated from SFB mono-associated mice
were reconstituted in sterile PBS and 200 μl of this suspension was administered
to each germ-free mouse by gavage in a sterile isolator. SFB reconstitution was
confirmed by qPCR of faecal 16S ribosomal DNA relative to negative germ-free
controls as previously described. Mono-associated mice were maintained for
3-4 weeks before analysis. 

Lamina propria myeloid cell isolation. Small intestinal segments were treated 
with medium containing 5 mM EDTA and 0.145 mg/ml dithiothreitol for 30 min
at 37°C with constant stirring. Tissue was further digested with 100 μg/ml
Liberase TL (Roche) and 500 μg/ml DNase I (Sigma-Aldrich), with continuous
stirring at 37°C for 30 min. Digested tissue was forced through a Cellector
tissue sieve (Bellco Glass) and passed through 70- and 40-μm cell strainers. Cells
were washed and incubated with a mixture of monoclonal antibodies containing
anti-CD11c (N418, Biolegend), anti-MHC II (M5/114.15.2, eBioscience),
anti-CD45 (30-F11, BD Bioscience), anti-CD24 (M1/69, Biolegend), anti-CD64
(X54-5/7.1, Biolegend), anti-CCR2 (#475301, R&D Systems), anti-CD11b (M1/70,
eBioscience), anti-Ly-6C (HK1.4, Biolegend) and anti-CD103 (2E7, eBioscience),
as well as monoclonal antibodies against the non-dendritic-cell components:
anti-Ly-6G (1A8, BD Biosciences), anti-NK1.1 (PK136, BD Biosciences), anti-TCRβ
(H57-597, BD Biosciences), anti-TCRδ (GL3, BD Biosciences), anti-Siglec F
(E50-2440, BD Biosciences) and anti-B220 (RA3-6B2, BD Biosciences). Different 
myeloid-cell populations were sorted by flow cytometry on a FACSAria. Purity 
was verified by flow cytometry on a FACSAria. The purity of all populations 
was > 99%. 

Adoptive transfer and antibody treatment. CD4+ T cells were isolated from
lymph nodes and spleens from wild-type or Foxp3GFP mice using the CD4+
T Cell Isolation Kit (Miltenyi, 130-104-454). Treg cells were sorted according to
GFP expression, and 1 x 10⁷ total CD4+ T cells or 1 x 10° Treg cells were trans-
ferred into Rag1−/− mice intravenously. Naive T cells from 7B8-Tg mice or
from CBir1-Tg mice were isolated using the naive CD4+ T cell Kit (Miltenyi,
130-104-453) and transferred into Rag1−/− mice intravenously (1 x 10° cells per
mouse) for 6-8 weeks.

IL-2-IL-2-antibody complexes were made by mixing 1 μg recombinant mouse
IL-2 (Biolegend) with 5 μg anti-IL-2 monoclonal antibody (clone JES6-1, BioXcell)
followed by incubation at 37°C for 30 min. Mice were injected intraperitoneally 
with IL-2-IL-2-antibody complexes, 150g anti-IL-17A (Clone 17F3, BioXcell) 
or isotype control IgG twice a week after T-cell transfer. For monocyte depletion, 
Rag1−/− mice were injected intraperitoneally with 20 μg anti-CCR2 (clone MC-21,
a gift from M. Mack), 200 μg anti-Gr1 (clone RB6-8C5, BioXcell) or isotype control
IgG on days 0, 1, 2, 3, 5, 7, 9, 11 and 13. Mice were euthanized 24h after the last 
injection. 

Quantitative real-time PCR. RNA from the terminal ileum was isolated with 
Trizol reagent (Thermo Fisher Scientific). cDNA was synthesized using the iScript
cDNA Synthesis kit (Bio-Rad) according to the manufacturer’s instructions. qPCR
was performed using SsoFast EvaGreen Supermixes (Bio-Rad) or LightCycler
480 Probes Master (Roche). Reactions were run with the CFX Connect Real-
Time PCR Detection System (Bio-Rad). The primer and probe set for mIl23p19
(Mm00518984) was purchased from Thermo Fisher Scientific. See Extended Data 
Table 1 for a list of primers and probes used in this study. 

Analysis of microbiota in the small intestine. The contents of the small intestine 
were collected from different mice, and bacterial DNA was extracted with the 
QIAamp DNA Stool Kit (QIAGEN) according to the manufacturer's instructions. 
Different species of bacteria were quantified by qPCR with primers specific for 16S 
rRNA genes using SsoFast EvaGreen Supermixes (Bio-Rad). See Extended Data 
Table 1 for a list of primers. 
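
The text does not state how the 16S qPCR signals were normalized. Purely as an illustration, one common convention expresses a species-specific 16S signal relative to a reference 16S signal as 2^−ΔCt, which assumes close to 100% amplification efficiency for both primer pairs. The function name and the Ct values below are hypothetical and are not taken from the study.

```python
def relative_abundance(ct_target: float, ct_reference: float) -> float:
    """Return 2^-(Ct_target - Ct_reference).

    Assumption: both primer pairs amplify with ~100% efficiency, so one
    cycle of difference corresponds to a twofold difference in template.
    """
    return 2.0 ** -(ct_target - ct_reference)


# Hypothetical Ct values for an SFB-specific and a reference 16S primer pair.
sample_a = relative_abundance(ct_target=25.0, ct_reference=20.0)
sample_b = relative_abundance(ct_target=23.0, ct_reference=20.0)

# Fold difference in relative SFB signal between the two samples.
fold_change = sample_b / sample_a
```

With these made-up inputs the target crosses threshold two cycles earlier in the second sample at the same reference Ct, so the relative signal is fourfold higher.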

Scanning electron microscopy. Mouse terminal ilea (immediately proximal to the 
ileal-caecal junction) were collected, slit and fixed overnight in fresh fixative. All 
subsequent processing was carried out in a Pelco Biowave microwave (TedPella) at 
250 W under 15-inch Hg vacuum. The tissue pieces were washed in 0.1 M sodium 
cacodylate buffer and post-fixed in 1% osmium tetroxide reduced with 0.8% 
potassium ferrocyanide in 0.1 M sodium cacodylate buffer. After three washes with 
buffer, the tissue pieces were dehydrated in a graded ethanol series and dried using 
a Bal-Tec 030 critical point dryer. The tissue pieces were subsequently coated with 
70 Å iridium using ion-beam sputtering (South Bay Technology) and imaged using
a SU-8000 scanning electron microscope (Hitachi). 

RNA-seq analysis. Total RNA was extracted from distal small intestine using 
TRIzol (Thermo Fisher Scientific). RNA-seq libraries were prepared using the 
Illumina TruSeq Stranded LT mRNA library preparation kit (Illumina), and 
libraries were sequenced on an Illumina NextSeq 500 to a read depth of 25-35 
million reads per sample. Sequencing quality control was assessed using FastQC 
version 0.11.3 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Reads 
were mapped to the UCSC mm9 Mus musculus genome using TopHat2 with the
following flags: --library-type fr-firststrand -r 25. The tool featureCounts from the
Subread package was used to count reads mapping to gene features, and
the R package DESeq2 was used to test for differential expression
and to generate regularized log-transformed (rlog) count tables. The R package 
pheatmap was used to generate the heatmap of the RNA-seq results, and ggplot2 
was used to generate scatterplots of the rlog counts. 

Serum triglycerides, free fatty acids and body composition measurement. 
Serum triglycerides and free fatty acids were measured using a Triglyceride Reagent 
Set (Pointe Scientific, Inc.) and Free Fatty Acid Quantitation Kit (Sigma-Aldrich) 
according to the manufacturer's instructions. Mouse body composition, including 
fat and lean masses, was measured with EchoMRI at four months of age. 

Administration of IL-22 adenovirus to mice. Adenoviruses expressing IL-22
and GFP were provided by B. Gao. IL-22 adenovirus was made by cloning mouse 
IL-22 cDNA (544bp) into the pENTR/D-TOPO system (Invitrogen), followed by 
using the Gateway system (Invitrogen) to perform an LR reaction with pAd/CMV/ 
V5-DEST to make the expression vector pAd/CMV/mIL-22. Mice were injected 
intravenously with 2 x 10° pfu IL-22 adenovirus or GFP adenovirus. 

Statistical analysis. No statistical methods were used to predetermine sample 
size. Prism software (GraphPad) was used for all statistical analysis. Student's
t-test (two-tailed), or one-way or two-way ANOVA were used for the statistical
analysis of differences between groups; ****P < 0.0001, and exact P values
are shown in figures.
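
As an illustration of the two-group comparison named above, the following sketch computes the pooled-variance Student's t statistic and its degrees of freedom using only the Python standard library. It is not the authors' workflow (the analyses were run in Prism); the function name and the sample values are hypothetical, and converting t to a P value would additionally require the t distribution, which is left out here.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Return (t, df) for a two-sample Student's t-test with pooled variance."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance (sample variances, n - 1 denominator).
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical measurements for two groups of five animals each.
group_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
group_2 = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = student_t(group_1, group_2)
```

The resulting statistic is compared against a two-tailed t distribution with `dof` degrees of freedom to obtain the P value.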

Data availability. RNA-seq data that support the findings of this study have 
been deposited in the Gene Expression Omnibus database with the accession 
number GSE86780. Figure 5a shows RNA-seq-related data. The data that sup- 
port the findings of this study are available from the corresponding author upon 
reasonable request. 

Extended Data Figure 1 | Quantification of pSTAT3+ ILC3s by histo-cytometry. Gating strategy for analysis of pSTAT3+ ILC3s from small intestine of
wild-type or Rag1−/− mice.

Extended Data Figure 2 | Cellular and molecular mechanism of
STAT3 activation in Rag1−/− small intestine. a, Immunofluorescence
staining of ileum from Rag1−/− (n = 4), Il23a−/−Rag1−/− (n = 5),
Il22−/−Rag1−/− (n = 4) and Il6−/−Rag1−/− mice (n = 4). b, Percentage of
pSTAT3+ ILC3s in a. c-d, Immunofluorescence staining of ileum from
Rorc(γt)GFP/+Rag1−/− and Rorc(γt)GFP/GFPRag1−/− mice (n = 4; c) and
Il22-tdTomato Rag1−/− mice (n = 3; d). Results are representative of three
independent experiments. Bars show mean; exact P values are given and
calculated by one-way ANOVA.

Extended Data Figure 3 | Mononuclear phagocyte subpopulation
responsible for IL-23 production and pSTAT3 activation. a, Flow
cytometry of total live cells from the small intestine lamina propria,
showing the gating strategy for sorting different myeloid-cell subsets:
CD103+CD11b− and CD11b+ conventional dendritic cells, CD64+CCR2−
macrophages and CCR2+ monocytes and monocyte-derived dendritic cells.
b, Expression of Il23a (n = 3) and Il12b (which encodes the
p40 subunit of IL-12) in different cell populations sorted as in a
(n = 2). c, Immunofluorescence staining of ileum from Rag1−/− mice and
Rag1−/− mice treated with anti-CCR2 or anti-Gr1 antibody for two weeks
(n = 4). Results are representative of three independent experiments.
Mean ± s.d.; exact P values are given and calculated by two-way ANOVA.

Extended Data Figure 4 | Microorganisms in the small intestine
of co-housed wild-type, Rag1−/− and Il23a−/−Rag1−/− mice.
a, Quantification of indicated bacteria species in the ileum of co-housed
wild-type (n = 6), Rag1−/− (n = 7) and Il23a−/−Rag1−/− (n = 8) mice by
real-time PCR with primers specific to 16S rRNA genes. Results are pooled
from two independent experiments. b, Scanning electron microscopy of
terminal ileum of co-housed mice as in a (n = 3). c, Quantification of the
length of SFB filaments in b. Results are representative of two independent
experiments. Bars show mean (a) and mean ± s.d. (c); exact P values are
given and calculated by one-way ANOVA.

Extended Data Figure 5 | Lack of ILC3 activation in SFB-negative
Tcra−/− mice. a, Immunofluorescence staining of ileum from wild-type
(n = 6), Rag1−/− (n = 5), Tcra−/− (n = 5) and Ighm−/− mice (n = 6).
b, Percentage of pSTAT3+ ILC3s in a. c, Quantification of SFB in distal
small intestine of non-co-housed Rag1−/− and Tcra−/− or co-housed
Rag1−/− and Tcra−/− mice (n = 4). Results are representative of
three (a, b) or two (c) independent experiments. Bars show mean;
****P < 0.0001; otherwise exact P values are shown and calculated
by one-way ANOVA.

Extended Data Figure 6 | Dysregulation of lipid metabolism by IL-22.
a, Expression of indicated genes in the ileum from age-matched male
Tcra−/− mice and Tcra−/− mice co-housed with Rag1−/− mice (n = 8).
Results are pooled from two independent experiments. b, Serum
triglyceride and free fatty acid levels from Tcra−/− mice and Tcra−/− mice
co-housed with Rag1−/− mice (n = 5). c-f, Expression of indicated genes in
the ileum (c, e) and serum triglyceride and free fatty acid levels (d, f) from
wild-type (c, d) or Tcra−/− mice (e, f) injected with adenovirus expressing
IL-22 (n = 5) or GFP (n = 5). Results are representative of two independent
experiments. Bars show mean (a, c, e) and mean ± s.d. (b, d, f); exact
P values are given and calculated by two-tailed Student's t-test.
Extended Data Figure 7 | STAT3 activation in IECs by IL-22 adenovirus. 
Immunofluorescence staining of ileum from C57BL/6 mice injected with 
adenoviruses expressing either IL-22 or GFP for two weeks. Images are 
representative of three sections from three mice of each group and results 
are representative of two independent experiments. 

Extended Data Table 1 | Primers and probes for quantitative PCR


fizatw | cATGcaGcaccTectaccrT 
SFB FW | GACGCTGAGGCATGAGAGCAT

Dynamics and number of trans-SNARE complexes
determine nascent fusion pore properties 


Huan Bao, Debasis Das, Nicholas A. Courtney, Yihao Jiang, Joseph S. Briguglio, Xiaochu Lou, Daniel Roston,
Qiang Cui, Baron Chanda & Edwin R. Chapman


The fusion pore is the first crucial intermediate formed during 
exocytosis, yet little is known about the mechanisms that determine 
the size and kinetic properties of these transient structures.
Here, we reduced the number of available SNAREs (proteins 
that mediate vesicle fusion) in neurons and observed changes in 
transmitter release that are suggestive of alterations in fusion 
pores. To investigate these changes, we employed reconstituted 
fusion assays using nanodiscs to trap pores in their initial open 
state. Optical measurements revealed that increasing the number 
of SNARE complexes enhanced the rate of release from single pores 
and enabled the escape of larger cargoes. To determine whether 
this effect was due to changes in nascent pore size or to changes in 
stability, we developed an approach that uses nanodiscs and planar 
lipid bilayer electrophysiology to afford microsecond resolution at 
the single event level. Both pore size and stability were affected by 
SNARE copy number. Increasing the number of vesicle (v)-SNAREs 
per nanodisc from three to five caused a twofold increase in pore size 
and decreased the rate of pore closure by more than three orders of 
magnitude. Moreover, pairing of v-SNAREs and target (t)-SNAREs 
to form trans-SNARE complexes was highly dynamic: flickering
nascent pores closed upon addition of a v-SNARE fragment, 
revealing that the fully assembled, stable SNARE complex does 
not form at this stage of exocytosis. Finally, a deletion at the base 
of the SNARE complex, which mimics the action of botulinum 
neurotoxin A, markedly reduced fusion pore stability. In summary, 
trans-SNARE complexes are dynamic, and the number of SNAREs 
recruited to drive fusion determines fundamental properties of 
individual pores. 

To understand how membranes fuse during exocytosis, the structure 
and dynamics of the first crucial intermediate, the fusion pore, must be 
determined. Moreover, fusion pore properties can affect cargo release
from neuroendocrine cells, and can potentially alter aspects of synaptic
transmission. For example, small unstable pores would allow only
transient release of small hormones from neuroendocrine cells, and
could, in principle, limit the rate of glutamate release from nerve termi-
nals to reduce postsynaptic responses during synaptic transmission.
Surprisingly little is known about the factors that determine the size 
and dynamics of fusion pores, but it has been hypothesized that at 


Figure 1 | Reducing v-SNARE or t-SNARE availability alters the
shape of mEPSCs. a, Cleavage of SYB2 by TeNT. PM, plasma membrane.
b, c, Representative traces (b) and quantification of mEPSC frequency
(freq.; c) after treatment with TeNT. Control: 0.81 ± 0.28 Hz (95%
confidence interval 0.18-1.44), n = 10; 100 pM TeNT: 0.10 ± 0.02 Hz
(0.04-0.17), n = 10; P < 0.001 for comparison of all three groups by
Kruskal-Wallis U-test; P = 0.001 for control versus 100 pM, Dunn’s
multiple comparison post hoc test. Inset in c shows immunoblot of SYB2,
β-actin and synaptotagmin 1 (SYT1) in control and TeNT-treated neurons.
Similar results were obtained in three independent trials. d, Averaged
mEPSC traces after treatment with TeNT (left). Amplitudes, control:
−27 ± 3 pA (−20 to −33), n = 10 neurons; 100 pM TeNT: −19 ± 2 pA
(−15 to −22), n = 10 neurons; P = 0.021, two-tailed t-test. Ten to ninety
per cent rise times, control: 1.2 ± 0.2 ms (0.9-1.6 ms), n = 10 neurons;
100 pM TeNT: 1.8 ± 0.2 ms (1.3-2.2), n = 10 neurons; P = 0.020, two-tailed
t-test. e, Averaged traces (left) and quantification (right) of γ-DGG-mediated
inhibition of mEPSCs. Control: 8 ± 3% reduction in amplitude
(1-16), n = 7 neurons; 100 pM TeNT: 23 ± 3% inhibition (17-31), n = 6
neurons; P = 0.005, two-tailed Mann-Whitney test. a-e, Experiments
were performed using one coverslip from each of three independent litters
of mice. f, Cd-SYB2 occupies t-SNAREs to inhibit fusion. g, Averaged
mEPSC traces (left); frequencies, amplitudes, and rise times are plotted
on the right. Frequency: control: 1.6 ± 0.3 Hz (1.0-2.1), n = 19; cd-SYB2:
0.9 ± 0.2 Hz (0.6-1.2), n = 21; P = 0.049, two-tailed Mann-Whitney test.
Amplitude: control: −17 ± 1 pA (−15 to −20); cd-SYB2: −13 ± 1 pA (−12 to
−15); P = 0.017, Welch’s two-tailed t-test. Rise time: control: 1.1 ± 0.1 ms
(0.9-1.3); cd-SYB2: 1.4 ± 0.1 ms (1.2-1.7); P = 0.021, Welch’s two-tailed
t-test. In f, g, experiments were performed using two litters, two coverslips
per litter. *P < 0.05; **P < 0.01. Data are presented as mean ± s.e.m.
(95% confidence interval).
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Figure 2 | Reconstituted fusion assays reveal changes in cargo efflux as mean + s.d., n=3 independent experiments. e, Illustration of single- 


rates as a function of SYB2 copy number. a, Ensemble fusion assay using 
nanodiscs and SUVs b, Release time courses for different maltodextrins 
from vesicles using ND6. c, Maltodextrin release efficiency against SYB2 
copy number per nanodisc. d, Dithionite quenching of N-(7-nitro-2- 
1,3-benzoxadiazol-4-yl) (NBD)-labelled lipid revealed that the number 
of open pores was the same for ND3 and ND8. Data in b-d are shown 


vesicle fusion assays. f, Left, representative trace showing sulforhodamine b 
(SRB) efflux through a single fusion pore with or without nanodiscs 
(ND); right, expanded time scale, fitted with a single exponential function 
(red). Similar results were obtained in four independent experiments. 

a.u., arbitrary units. g, SRB efflux rates using ND3, ND5 and ND7; n=54, 51, 
and 53 respectively; four independent trials per condition. 
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Figure 3 | Properties of single fusion pores measured by planar lipid 
bilayer electrophysiology. a, Illustration of the nanodisc-BLM assay. 

b, Traces of single pores at AY=—50 mV for NDO, ND3, ND5 and ND7. 
Closed (C) and open (O) states are indicated, along with the respective 
currents. c, Pore currents obtained using the indicated nanodisc; red 
arrows indicate representative pores shown in b. ND3: —4+1 pA 

(—3 to —5 pA); ND5: —11+1 pA (—9 to —13 pA); ND7: -18+2 pA 
(—15 to —21 pA); P< 0.001, Kruskal-Wallis U-test. ND3 versus ND5, 


P<0.001; ND3 versus ND7, P< 0.001; ND5 versus ND7, P= 0.036, 
Dunn's multiple comparison post hoc tests. Data shown as mean + s.e.m. 
(95% confidence interval). d, Current—voltage (J-V) relationships for 
pores formed using the indicated nanodiscs. Data are presented as 

mean + s.e.m. e, Open dwell-time histograms of pores. n= 14, 20 and 

20 independent BLMs for ND3, ND5 and ND7 respectively; five sets of 
nanodiscs of each type were used. ***P < 0.001; *P=0.05. 
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Figure 4 | Trans-SNARE complexes are 
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and current histogram (b) of a pore formed 
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least three SNARE complexes are required to hold fusion pores open 
and allow efficient cargo release in a reconstituted system’. Here, we 
used a combination of approaches to determine directly whether the 
number of SNAREs that drive fusion dictates fundamental properties of
nascent fusion pores, and whether trans-SNARE complexes are stable 
or dynamic. 

To examine how SNARE copy number influences neurotransmitter 
efflux from synaptic vesicles, we recorded a-amino-3-hydroxy-5- 
methyl-4-isoxazole propionic-acid receptor (AMPAR)-mediated 
miniature excitatory postsynaptic currents (mEPSCs) in neuronal 
cultures treated with tetanus toxin (TeNT) (Fig. 1a-c). Although
higher doses nearly abolished mEPSCs (Fig. 1b, c), 100 pM TeNT, 
which caused 50% cleavage of the vesicle-associated membrane protein 
SYB2 (also known as VAMP2; Fig. 1c), yielded a sufficient frequency 
of mEPSCs to enable quantitative analysis (Fig. 1b). Notably, mEPSCs 
that remained after TeNT treatment had smaller amplitudes and 
slower rise kinetics (Fig. 1d) than untreated controls, consistent with a 
previous report examining miniature N-methyl-D-aspartate (NMDA)-
mediated events®. Application of the rapidly dissociating, competitive
AMPAR antagonist γ-D-glutamyl-glycine (γ-DGG, 200 μM)
inhibited mEPSCs in TeNT-treated neurons more potently than in
untreated control cells (Fig. 1e). As inhibition by γ-DGG is inversely
proportional to the glutamate concentration at receptors’, these find- 
ings suggest that TeNT treatment impaired the ability of glutamate to 
escape synaptic vesicles into the synaptic cleft. We tested this hypothesis 
further by overexpressing the cytosolic domain of SYB2 (cd-SYB2), 
which binds native t-SNAREs to inhibit trans-SNARE complex 
formation (Fig. 1f, g, Extended Data Fig. 1). Neuronal cultures expressing 
cd-SYB2 had fewer mEPSCs than control neurons, and these were 
smaller in amplitude and slower to rise. Together, these experiments 
suggest that the abundance of trans-SNARE pairs might modulate the 
efflux of glutamate through fusion pores. 

To test this hypothesis directly, we devised an in vitro assay to probe 
pores using cargoes of different sizes. As shown previously, fusion 
between small unilamellar vesicles (SUVs) and 13-nm nanodiscs 
results in pores that cannot dilate, owing to the rigid framework of the 
nanodisc, thus enabling the biochemical characterization of pores in 
their initial open state”’. We encapsulated a variety of maltodextrins 


in t-SNARE SUVs and incubated them with v-SNARE-bearing 
nanodiscs that harboured one to eight copies of SYB2, designated 
ND1-ND8. Flux of cargo through fusion pores was monitored using
an optical sensor that recognizes each of the maltodextrins used 
(Fig. 2a-c, Extended Data Fig. 2a, b). Using a fixed number of SNAREs 
(ND6), the rank order of cargo release rates correlated with cargo size 
(Fig. 2b): the smaller the cargo, the more quickly it escaped through 
fusion pores. As the number of SNAREs per fusion reaction was 
increased from one to eight, the rate of efflux of each maltodextrin 
(except for cyclodextrin) also increased, and larger cargoes were able 
to escape (Fig. 2c; estimates of pore size are provided in Extended 
Data Fig. 2c, d). Using a dithionite quenching assay, we found that 
the number of SNAREs did not influence the total number of open 
fusion pores that were formed (Fig. 2d, Extended Data Fig. 2e). 
Together, these data show that SNARE copy number determines the 
size and/or kinetic stability of individual fusion pores; these find- 
ings were confirmed using single-vesicle fusion assays (Fig. 2e-g, 
Extended Data Fig. 2f, g). 

To achieve sub-millisecond time resolution, we developed an approach 
for monitoring recombinant fusion pores electrophysiologically. 
t-SNAREs were reconstituted into black lipid membranes (BLMs), 
at a density of 0.4 molecules per μm² (Extended Data Fig. 3c, d) in a
planar lipid bilayer electrophysiology setup!” (Fig. 3, Extended Data
Figs 3a, b, e and 4). Addition of v-SNARE-bearing nanodiscs into the cis
chamber resulted in the formation of single fusion pores, as evidenced 
by the currents detected (Fig. 3a, b). Control experiments establish that 
these are bona fide fusion pores that result from trans-SNARE pairing 
(Extended Data Table 1). Remarkable differences were observed among 
pores formed by ND3, ND5 and ND7 (Fig. 3b-e). At —50 mV, ND3 
produced pores that remained closed most of the time and flickered 
open only transiently (Fig. 3b, ND3). By contrast, pores formed by 
ND5 remained open most of the time, but closed transiently (Fig. 3b, 
ND5). Finally, ND7 pores remained open during the entire recording
period; these pores flickered but never closed completely (Fig. 3b,
ND7). To estimate pore size, we generated I-V plots (Fig. 3d). From 
the conductance values, the estimated diameters of pores formed using 
ND3, ND5 and ND7 were 1.1 ± 0.3 nm, 2.2 ± 0.3 nm and 2.9 ± 0.3 nm, respectively,
consistent with the range of pore sizes observed using nanodiscs
and ‘flipped’ t-SNAREs on the surface of cells!!. We made similar observations
using 50-nm nanodiscs (Extended Data Fig. 5).

Kinetic analysis revealed differences in the open dwell-time dis- 
tribution between ND3 and ND5 or ND7 (Fig. 3e): increasing the 
number of SNAREs notably enhanced the stability of the open state 
(Extended Data Table 2). Moreover, even though the kinetic stabili- 
ties of pores formed using ND5 and ND7 were similar, pore size still 
increased at the higher copy number. Therefore, the size and dynam- 
ics of individual pores are differentially regulated by SNARE copy 
number. Similar results were obtained using yeast SNAREs””, estab- 
lishing the generality of these findings (Extended Data Fig. 6, Extended 
Data Table 2). 

As the nanodisc-BLM recordings showed that recombinant fusion 
pores, under all the measured conditions, rapidly convert between open 
and at least partially closed states, we hypothesized that the underlying 
trans-SNARE complexes exist in metastable conformational states. 
To test this hypothesis, we titrated cd-SYB2 onto pre-formed pores 
assembled using ND5 (Fig. 4a-d). Pores initially destabilized, and at
the highest dose all pores eventually closed (Fig. 4a-c); at lower doses
of cd-SYB2, we sometimes observed partial closure (Extended Data 
Fig. 7a, b). Addition of cd-SYB2 also closed fusion pores formed using 
ND7, albeit with reduced potency (Extended Data Fig. 7c, d). In con- 
trol experiments, bovine serum albumin had no effect on pores and 
cd-SYB2(4A), a mutant with impaired t-SNARE binding activity), had 
only limited effects at the highest dose tested (Extended Data Fig. 8a, b). 
Together, these findings demonstrate that in contrast to cis-SNARE 
complexes, which are highly stable!4, trans-SNARE interactions are 
dynamic and potentially reversible, even after pores have opened 
(Fig. 4d). Consistent with this conclusion, impairment of trans-SNARE 
interactions via a C-terminal truncation of SNAP-25B (consisting of 
residues 1-197) that mimics cleavage by botulinum neurotoxin A 
results in a marked increase in flickering behaviour without affecting 
pore size (Fig. 4e, Extended Data Fig. 8c). Truncation of twenty residues 
(to leave residues 1-186) completely abolished pore formation (Fig. 4e). 
These data indicate that trans-SNARE interactions, at the base of the 
SNARE complex, control pore dynamics. 

In summary, the exocytotic fusion pore corresponds to the initial, 
narrow channel formed between secretory vesicles and the plasma 
membrane. Release of neurotransmitters and hormones occurs via 
diffusion through this transient structure before, or even without, 
dilation’. Previous electrophysiological measurements revealed a 
range of pore sizes!>!®, as well as flickering behaviour!”®, in cells; 
these observations are recapitulated in the nanodisc-BLM system 
described here. Our results provide direct experimental support for 
the idea that a certain number of SNAREs is needed to hold fusion 
pores open’, with more SNAREs resulting in larger fusion pores!”.
Moreover, even after fusion pores have opened, trans-SNARE 
complexes remain dynamic and reversible. It will be interesting to 
determine how far a fusion pore must dilate in order for the SNARE 
complex to become irreversible, and to ascertain the effect of myriad 
regulatory factors on the properties of individual pores”®”!. Finally, it 
will also be interesting to determine whether the findings concerning 
SNARE copy number reported here apply to other cellular fusogens”’, 
including atlastin (homotypic endoplasmic reticulum fusion”), mito- 
fusin 1 and 2 (mitochondria fusion‘), and the proteins that mediate 
ectoplasmic fusion”>. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper.

METHODS


No statistical methods were used to predetermine sample size. The experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment. 

Reagents. 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine-N-(biotinyl) 
(biotin-PE), 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (PC), 1,2-dioleoyl- 
sn-glycero-3-phospho-L-serine (PS), 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine
(PE), 1,2-dioleoyl-sn-glycero-3-phospho-(1'-rac-glycerol) (PG), 1,2-dioleoyl-sn-
glycero-3-phosphoethanolamine-N-(7-nitro-2-1,3-benzoxadiazol-4-yl) (NBD-PE)
and 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine-N-(lissamine rhodamine
B sulfonyl) (rhodamine-PE) were purchased from Avanti Polar Lipids. γ-DGG was
obtained from Abcam. All other chemicals were from Sigma-Aldrich. 

Cell culture and lentivirus. Cultured rat cortical neurons were prepared from 
embryonic day 18-19 Sprague Dawley rats as previously described”®. Sex was not 
determined, and neurons from all of the pups in each litter were pooled during 
culturing. In brief, neurons were plated on poly-D-lysine-coated glass coverslips
(12 mm) at a density of 100,000 cm⁻². Neurons were cultured in Neurobasal
A medium (Gibco) supplemented with B-27 (2%, Gibco) and GlutaMAX (2 mM,
Gibco) and maintained at 37 °C in a 5% CO₂ humidified incubator. After day 13
or 14 in vitro, half of the neuronal coverslips were treated with tetanus toxin (1 nM 
or 100 pM, List Biological Labs) at 37°C for 24h. Untreated neurons were used as 
controls. Electrophysiological recordings of both treated and untreated neurons 
were performed immediately following the 24-h treatment period (day 14-15 
in vitro). All procedures were approved by the Animal Care and Use Committee 
at the University of Wisconsin and performed in accordance with the guidelines 
of the National Institutes of Health. 

For the viral expression experiments, DNA sequences encoding either cd-SYB2 
(residues 1-95 of SYB2) or GFP were subcloned into a FUGW transfer plasmid 
modified with a synapsin promoter and an IRES-expressed soluble GFP marker. 
Lentivirus particles were generated by co-transfection of the transfer plasmid 
and helper plasmids (pCD/NL-BH*ΔΔΔ and VSV-G encoding pLTR-G) into
HEK293T/17 cells (ATCC, not tested for mycoplasma contamination)*’. The
supernatant was collected after 48-72 h of expression, filtered through a 0.45-μm
PVDF filter, and concentrated by ultra-centrifugation at 110,000g for 2 h. Viral
particles were resuspended in Ca²⁺/Mg²⁺-free PBS and used to infect neurons at
day 6 in vitro. Electrophysiological recordings were then performed at day 14-15 
in vitro. 

Immunocytochemistry. At 14 days in vitro, cell cultures were fixed for 15 min with 
4% paraformaldehyde (wt/vol) in PBS, permeabilized for 10 min with 0.2% saponin 
(wt/vol), and blocked for 60 min with 10% goat serum (vol/vol, Abcam) plus 0.1% 
Tween-20 (vol/vol). Coverslips were then incubated with primary antibodies (anti- 
GFP: Abcam, 1:1,000, chicken; anti-MAP2: EMD, 1:1,000, mouse) at room temperature
for 1 h. Samples were washed three times with 0.02% saponin (wt/vol) in PBS
and labelled with Alexa Fluor 488-tagged anti-chicken and Alexa Fluor 546-tagged 
anti-mouse IgG (1:400, Invitrogen) for 1h at room temperature. Samples were 
again washed three times and mounted in Fluoromount G mounting medium 
(Southern Biotech). Images were obtained using an FV1000 laser-scanning 
confocal microscope (Olympus) with FV10-ASW 3.1 acquisition software, using 
a 20x 1.0 NA water objective, under identical laser and gain settings. Images were 
analysed using ImageJ (NIH). 

Protein purification and reconstitution. Membrane scaffold protein (MSP) 
for 13 nm? and 50nm?* nanodiscs, the maltose sensor’’, neuronal (rat SYB2, 
syntaxin-1A and SNAP-25B) and yeast (Snc2p, Sso1p and Sec9c (residues
401-651)) SNARE proteins were purified as described previously’”. t-SNARE com- 
plexes bearing truncated SNAP-25B (corresponding to residues 1-197 or residues 
1-186) were also prepared and studied; the former truncation mimics cleavage by 
botulinum neurotoxin A*’. To prepare t-SNARE vesicles, lipids (10% PE, 15% PS 
and 75% PC) and the t-SNARE heterodimer were incubated with the respective cargoes
and 2% octyl-β-glucoside on ice for 30 min. Detergent was removed by addition
of Biobeads (Bio-Rad) (one-third volume) followed by gentle shaking (at 4 °C,
overnight). The mixture was extruded through a 0.2-μm filter and the t-SNARE
vesicles were purified by passing through a PD10 column (5 ml) equilibrated in 
reconstitution buffer (25 mM HEPES, pH 7.5, 100mM KCl, 1mM DTT). Finally, 
purified t-SNARE vesicles were dialysed against reconstitution buffer (4°C, over- 
night). Reconstitution of SYB2 into 13-nm nanodiscs was performed as described. 
For reconstitution of SYB2 into 50-nm nanodiscs, the MSP:lipid ratio was 2:4,000. 
To incorporate different copy numbers of SYB2 into 50-nm nanodiscs, the fol- 
lowing MSP:SYB2 ratios were used: 2:2 (ND3), 2:4 (ND5) and 2:10 (ND7). The 
reconstituted nanodiscs were incubated with Ni²⁺-NTA resin to remove SYB2-free
nanodiscs. Nanodiscs containing SYB2 were eluted with reconstitution buffer con- 
taining 0.4 M imidazole. The nanodiscs were further purified via sucrose density- 
gradient centrifugation*’, followed by dialysis against reconstitution buffer (4°C, 
overnight). The copy number of SYB2 per nanodisc refers to the total number of 
SYB2 molecules, not the number of copies per face of the nanodisc. 

Ensemble fusion assays. Maltodextrin release assays were carried out using the
maltose sensor (1 μM)”’, SYB2 nanodiscs (0.2 μM), and t-SNARE vesicles (1 μM)
containing maltodextrins, at 37°C in reconstitution buffer. The fluorescence of the 
sensor was monitored for 1h using a plate reader (HT synergy, BioTek). After each 
run, 141M melittin was added to each sample, and data were collected for another 
30 min. Melittin forms channels to release all the maltodextrin from each vesicle, 
thus producing the maximum fluorescence signal (100%) that can be obtained. 
Data were collected from three independent experiments. 

Efflux rates were used to estimate fusion pore size. As described previously’,
the time $t$ it takes for a single cargo molecule to traverse the fusion pore is
proportional to

$$ t \propto \frac{R^{2}}{a\,(r_p - r_c)^{2}} $$

in which $a$ is the frequency of collisions of the cargo molecule with the membrane
that forms the t-SNARE vesicle, and $R$, $r_p$ and $r_c$ are the radii of the liposome, fusion
pore and cargo, respectively. Thus, the difference in the release rates between
maltose and maltotetraose is given by:

$$ \frac{k_{\mathrm{maltose}}}{k_{\mathrm{maltotetraose}}} = \frac{(r_p - r_{\mathrm{maltose}})^{2}}{(r_p - r_{\mathrm{maltotetraose}})^{2}} $$

in which $k_{\mathrm{maltose}}$ and $k_{\mathrm{maltotetraose}}$ are the respective release rates for these sugars,
and $r_{\mathrm{maltose}}$ (0.35 nm) and $r_{\mathrm{maltotetraose}}$ (0.42 nm) are their radii as determined by
molecular dynamics simulations.
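
The rate-ratio relation above can be inverted to give the pore radius directly. The following is a minimal sketch, not the authors' analysis code: the function name and the example release rates are hypothetical, and only the two cargo radii (0.35 nm and 0.42 nm) come from the text. Taking the square root of the ratio, s = sqrt(k_maltose / k_maltotetraose), and solving for r_p gives r_p = (s · r_maltotetraose − r_maltose) / (s − 1).

```python
import math


def pore_radius_from_rate_ratio(k_small, k_large, r_small, r_large):
    """Solve k_small / k_large = ((r_p - r_small) / (r_p - r_large))**2 for r_p.

    k_small, k_large: release rates of the smaller and larger cargo (same units).
    r_small, r_large: cargo radii in nm. Returns the pore radius r_p in nm,
    taking the physical root with r_p > r_large.
    """
    if k_small <= k_large:
        raise ValueError("the smaller cargo must be released faster than the larger cargo")
    s = math.sqrt(k_small / k_large)
    return (s * r_large - r_small) / (s - 1.0)


# Hypothetical release rates (arbitrary units), for illustration only.
# The radii are the maltose and maltotetraose values quoted in the text.
r_p = pore_radius_from_rate_ratio(k_small=4.0, k_large=1.0, r_small=0.35, r_large=0.42)
print(f"estimated pore radius: {r_p:.2f} nm")
```

Because only the ratio of rates enters, the collision frequency and liposome radius cancel, which is why the estimate needs no absolute calibration of the release assay.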

Molecular dynamics simulations to estimate maltodextrin size. All calcula- 
tions used the program CHARMM™, and dynamics calculations were conducted 
through the CHARMM interface with OpenMM”*. The sugars were treated with 
the CHARMM glycan force field***> and water was treated with the TIP3P model**. 
Initial structures for the sugars were obtained from the online tool SWEET”. The 
online tool CHARMM-GUI"***? was used to construct the initial setup for each 
sugar. The sugars were each dissolved in a cubic box of water such that the nearest 
edge of the box was at least 10 Å away. After a brief geometry optimization, the
systems were heated to 298 K (150 ps), followed by a 1-ns equilibration at 298 K 
and 1 atm using periodic boundary conditions. The simulations employed the 
NPT ensemble using the Andersen thermostat and MC barostat. Non-bonded 
interactions were cut off above a distance of 12 Å with a switching function from
10 Å to 12 Å, and the integration time step was 1 fs. Following equilibration, 10-ns
production runs were used to determine the principal axes shown in Extended 
Data Fig. 2. 

Single vesicle fusion assays. A prism-based total internal reflection fluorescence (TIRF)
microscopy setup and associated flow chambers were prepared as described previously’.
t-SNARE vesicles containing SRB were prepared and immobilized on the surface 
of quartz slides as described*’. The trapping efficiency of SRB in vesicles was 
~2% of the total SRB and the [SRB] was ~1 mM per vesicle. SYB2 nanodiscs 
(100 μl, 50 nM) were injected at the indicated time for 10 s and data were recorded
for an additional 400 s. Leakage and photobleaching of SRB were negligible (<7% 
fluorescence decrease). By contrast, opening of fusion pores led to fluorescence 
decreases of >50%. Fusion probability was defined as the fraction of tethered SUVs 
in which a pore opening event was observed. 

Planar lipid bilayer electrophysiology. Planar lipid bilayer recordings that are 
shown in the main figures were performed using a Planar Lipid Bilayer Workstation 
(BLM) from Warner Instruments!” as described*!. In brief, lipids (75% PE and 
25% PG, at 30 mg/ml in n-decane) were first painted onto a 150-μm aperture in
a 1-ml, white Delrin cup (Warner Instruments), allowed to dry for 15 min, and
then the aperture was bathed in 1 ml of 25 mM HEPES, pH 7.5 and 100 mM KCl.
The lipid solution was gently re-applied to the hole until a conductance-blocking 
seal was formed, as determined by capacitance measurements. This process was 
repeated, either with a brush or an air bubble, until the desired capacitance was 
achieved. Syntaxin-1A/SNAP-25B proteoliposomes (75% PE and 25% PG) were 
then added to the cis chamber of the apparatus; these spontaneously fuse with 
the planar bilayer, thus depositing the t-SNAREs into the BLM. Then, to form 
fusion pores, v-SNARE nanodiscs were added to the cis chamber. Pores form within 
2-30 min and flicker open, or stay open or flicker closed for >90 min. Currents 
were recorded using Bilayer Clamp Amplifier BC-535 (Warner Instrument) and 
a Digidata 1550B (with Humsilencer) acquisition system (Molecular Devices). 
Single-channel recordings were sampled at 10 kHz using pCLAMP10 software 
(Molecular Devices), and filtered at 5 kHz using a multisection Bessel filter. 
ΔΨ = Ψcis − Ψtrans (Ψtrans = 0 V). All single channel data were analysed using
Clampfit 10.7 (Molecular Devices) and MS Origin 2016 (OriginLab). Histograms 
of background currents were well fitted by a single Gaussian (centred around 0 pA), 
whereas current histograms of open fusion pores required a multiple Gaussian 
model (with the centre of the additional Gaussian representing the mean pore
current). In all figures showing BLM recordings, the representative traces were
filtered at 1 kHz for display purposes. 

For the experiments reported in Extended Data Table 1, planar lipid bilayer 
recordings were carried out using the Orbit Mini system (Nanion Technologies).
Membranes were painted onto a MECA chip (Nanion Technologies), and fusion
pores were formed and analysed as described above, except that data were low- 
pass filtered at 2 kHz. 

The fraction of closed pores (Fig. 4) was calculated using the equation:

$$ \text{Fraction closed} = \frac{\text{closed dwell time}}{\text{open dwell time} + \text{closed dwell time}} $$

from a 10-min recording at the indicated [cd-SYB2].
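
As an illustration of this definition, the sketch below sums the open and closed dwell times from an idealized recording and returns the closed fraction. The function name and the dwell times are hypothetical, chosen only so that they add up to a 10-min (600 s) recording. They are not data from this study.

```python
def fraction_closed(open_dwell_times, closed_dwell_times):
    """Return closed time / (open time + closed time) for one recording.

    Both arguments are sequences of dwell times in the same unit (here seconds).
    """
    total_open = sum(open_dwell_times)
    total_closed = sum(closed_dwell_times)
    total = total_open + total_closed
    if total <= 0:
        raise ValueError("recording must contain at least one dwell of non-zero length")
    return total_closed / total


# Hypothetical idealized 10-min recording (dwell times in seconds).
open_dwells = [120.0, 200.0, 80.0]   # 400 s open in total
closed_dwells = [50.0, 150.0]        # 200 s closed in total
f_closed = fraction_closed(open_dwells, closed_dwells)
print(f"fraction closed: {f_closed:.3f}")
```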


Pore diameters were calculated using the equation

$$ \frac{1}{\gamma} = \left(l + \frac{\pi r}{2}\right)\frac{\rho}{\pi r^{2}} $$

in which $\gamma$ is the pore conductance, $r$ is the radius, $l$ is the thickness of the bilayer (10 nm;
assuming the pore is a cylinder that spans both the vesicle and target membrane),
and $\rho$ is the resistivity of the buffer (100 Ω cm)”.
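
The conductance-to-diameter conversion can be sketched numerically. The code below assumes the cylindrical-pore relation 1/γ = (l + πr/2)·ρ/(πr²), with l = 10 nm and ρ = 100 Ω cm as stated above, and solves the resulting quadratic in r. The function name is hypothetical. The 80 pS input is an illustrative chord conductance (4 pA at 50 mV, from the ND3 mean current in Fig. 3c); the study derived conductances from I-V plots, so this is not the authors' value.

```python
import math


def pore_diameter_nm(conductance_pS, length_nm=10.0, resistivity_ohm_cm=100.0):
    """Pore diameter (nm) from conductance, for a cylindrical pore with access resistance.

    Rearranging 1/gamma = (l + pi*r/2) * rho / (pi*r**2) gives the quadratic
    (pi/gamma) * r**2 - (pi*rho/2) * r - rho*l = 0, solved here for the positive root.
    """
    gamma = conductance_pS * 1e-12       # siemens
    l = length_nm * 1e-9                 # metres
    rho = resistivity_ohm_cm * 1e-2      # ohm metres
    a = math.pi / gamma
    b = -math.pi * rho / 2.0
    c = -rho * l
    r = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return 2.0 * r * 1e9


# Illustrative chord conductance: 4 pA / 50 mV = 80 pS.
d_nd3 = pore_diameter_nm(80.0)
print(f"estimated pore diameter: {d_nd3:.2f} nm")
```

With these inputs the estimate falls close to the ND3 diameter reported in the main text, which serves as a sanity check on the reconstructed relation.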

Stopped-flow measurements. Equal volumes of the maltose sensor (0.1 μM) and
maltodextrins (indicated concentrations) were mixed in an SX.18MV stopped-flow 
spectrometer (Applied Photophysics). The samples were excited at 480 nm and the 
emission was collected at 520 nm using a 10-nm bandpass filter. Data were obtained 
from three independent experiments. 
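
The linear analysis of observed rate constants described in Extended Data Fig. 2b can be sketched as follows. This assumes simple one-step binding with the maltodextrin in excess over the sensor, so that k_obs = k_on·[maltodextrin] + k_off, which means the slope gives the on-rate and the intercept gives the off-rate. The concentrations are hypothetical and the rates are synthetic, generated without noise from the maltose values quoted in that caption; the helper name `fit_line` is also an assumption.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical maltodextrin concentrations (uM) and synthetic, noise-free k_obs values (s^-1).
conc_uM = [10.0, 20.0, 40.0, 80.0]
k_obs = [0.58 * c + 34.0 for c in conc_uM]

k_on, k_off = fit_line(conc_uM, k_obs)
print(f"k_on = {k_on:.2f} uM^-1 s^-1, k_off = {k_off:.1f} s^-1")
```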

Electrophysiology. Whole-cell voltage-clamp recordings were made using a 
Multiclamp 700B amplifier (Molecular Devices). Recordings were carried out 
at room temperature in a bath solution containing (in mM): 128 NaCl, 5 KCl, 
2 CaCl₂, 1 MgCl₂, 30 D-glucose and 25 HEPES, pH 7.3 and 305 mOsm. Patch
pipettes (3-5 MΩ) were pulled from borosilicate glass (Sutter Instruments).
The pipette internal solution contained (in mM): 130 K-gluconate, 1 EGTA, 
10 HEPES, 2 ATP, 0.3 GTP, and 5 sodium phosphocreatine, pH 7.35 and 275 
mOsm. Data were acquired using a Digidata 1440A (Molecular Devices) and 
Clampex 10 software (Molecular Devices) at 10 kHz. Neurons were held at —70 mV. 
Series resistance was compensated and recordings were discarded if the access 
resistance rose above 15 MΩ at any point. AMPARs were pharmacologically isolated
with 2-amino-5-phosphonovalerate (D-AP5) (50 μM, Abcam) and picrotoxin
(100 μM, Abcam). For mEPSC recordings, tetrodotoxin (TTX, 1 μM, Abcam) was
included in the bath solution. In some experiments, neurotransmitter release was
evoked by a single stimulus using a concentric bipolar electrode (FHC, 125/50 μm
extended tip). Stimulating electrodes were placed ~100-200 μm from the soma
being recorded and stimulation currents (0.4-0.7 mA) were adjusted per recording
to measure the maximum field-evoked current. For these evoked recordings,
the pipette internal solution was modified to include 130 mM KCl (replacing
K-gluconate) and 5 mM QX-314 chloride (Tocris) and the bath solution was
modified to include CNQX (10 μM, Abcam) instead of picrotoxin. Traces were
analysed using Clampfit 10 (Molecular Devices). 


The two sets of experiments reported in Fig. 1 (a-e versus f and g) were 
conducted at different times using independent materials, resulting in slightly 
different values for mEPSC frequencies and amplitudes, with no significant effect
on kinetics. 

Other methods. SDS-PAGE, western blotting, fluorescence spectroscopy, and 
dithionite quenching assays were performed as described previously””. 

Data availability. All original data will be made available by the corresponding 
authors upon reasonable request. For gel source data, see Supplementary Fig. 1. 
Source data for Fig. 2b-d are available online.

Extended Data Figure 1 | Viral expression of cd-SYB2. a, cDNA
encoding the cytosolic domain of SYB2 (cd-SYB2, residues 1-95) was
cloned into a FUGW transfer vector modified to have a synapsin promoter
and to co-express soluble eGFP via an IRES sequence; eGFP serves as a
marker for infection efficiency. For control experiments, eGFP alone was
expressed. Both constructs were packaged into lentivirus for expression in
neuronal cultures. b, Representative images of cells stained for a neuronal
marker (MAP2, magenta) and GFP (green). Images were adjusted for
brightness and contrast for the sake of presentation. Both preparations
used for Fig. 1g were examined and had similar GFP expression levels
and coverage across cells. The scale bar (50 μm) applies to all nine images
shown. c, Quantification of the ICC demonstrating that both cd-SYB2
and control viruses achieved a nearly 100% infection rate. Per cent
infected refers to the number of visually identified MAP2-positive somas
(that is, neurons) that were also positive for GFP. Three fields of view
were quantified for each condition. d, Representative traces (left) and
quantification (right) demonstrating that cd-SYB2 was expressed at levels
sufficient to inhibit evoked IPSCs triggered by field stimulation (P = 0.032,
two-tailed t-test; n = 10 neurons for each condition, using two litters of
mice, three coverslips per condition). Data are presented as mean ± s.e.m.
*P < 0.05.

Extended Data Figure 2 | Binding of different maltodextrins to the
maltose sensor, determination of pore sizes and the relative fraction 
of open pores, and characterization of the single vesicle fusion assay. 
a, Fluorescence emission spectra of the maltose sensor in the absence 
or presence of the indicated maltodextrin (top). Equilibrium titration 
of maltodextrin binding to the maltose sensor. The data were fitted with 
a single site binding equation, using Prism 6 (GraphPad), to determine 
the dissociation constants. n = 3 independent experiments. Data are
presented as mean ± s.d. (bottom). b, Kinetics of maltodextrin binding to
the maltose sensor using stopped-flow (top). The observed rate constants
(k_obsd) were plotted against maltodextrin concentration. The data were
fitted with linear functions, yielding the off- and on-rates for binding of
each maltodextrin to the maltose sensor, as follows: 34 ± 1 s⁻¹ and
0.58 ± 0.03 μM⁻¹ s⁻¹ (maltose), 14 ± 1 s⁻¹ and 6.7 ± 0.3 μM⁻¹ s⁻¹
(maltotriose), and 29 ± 9 s⁻¹ and 7.3 ± 0.2 μM⁻¹ s⁻¹ (maltoheptaose)
(bottom). n = 3 independent experiments. Data are presented as
mean ± s.d. c, The lengths of the three principal axes for each sugar
were averaged during 10-ns simulations (left). Error bars indicate s.d.
from 1,000 snapshots taken every 10 ps during the simulation. Data are
presented as mean ± s.e.m. Representative snapshots of the sugars from
the simulations are shown as space-filling models (right). d, Pore sizes
were determined from the maltodextrin flux assays shown in Fig. 2c (see
Methods). n = 3 independent experiments. e, Representative traces of
dithionite quenching experiments using ND3 and ND8. Dithionite was
added at the indicated time points during fusion reactions to determine 
the degree of protection of NBD. The degree of protection was plotted 
against the incubation time, as shown in Fig. 2d. Similar results were 
obtained in three independent trials. Quenching by dithionite is much 
faster than cargo release (for example, Fig. 2b). This is because the kinetics 
of most of the dithionite quenching that was observed was not a reflection 
of its influx via fusion pores, as more than 50% of the NBD-PE is on 

the outer leaflet. It is likely that dithionite can readily enter even small, 
flickering fusion pores, such as those formed by ND3, because it is smaller 
(174.11 Da) than the smallest maltodextrin used in this study (maltose; 
360.31 Da). Also, the dithionite is present at high concentrations (5 mM). 
f, Plot of fusion probability observed using the indicated nanodiscs; 

the black bars indicate experiments in which t-SNARE SUVs were 
pre-incubated with cd-SYB2 to prevent trans-SNARE pairing. Data are 
presented as mean + s.d. g, Histograms of the fluorescence intensities of 
the tethered t-SNARE vesicles. n = 54 (ND3), 51 (ND5), and 53 (ND7) 
traces obtained from four independent trials under each condition. 
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Extended Data Figure 3 | Characterization of the nanodisc-BLM 
system: effect of t-SNARE density and detection of multiple pores. 
a, b, Fusion pores were formed using ND5 and BLMs with different 
t-SNARE densities. t-SNARE density was varied by using SUVs
that harboured 100 (SUVt-SNARE (100)), 200 (SUVt-SNARE (200)) or
400 (SUVt-SNARE (400)) copies of the SNAP-25B/syntaxin1A heterodimer
per liposome. As SUVt-SNARE (200) and SUVt-SNARE (400) resulted in fusion
pores with similar sizes and kinetic properties, SUVt-SNARE (200) was
used for all other experiments in this study. n = 5 for SUVt-SNARE (100),
n = 20 for SUVt-SNARE (200) and n = 5 for SUVt-SNARE (400). The representative
traces (a) correspond to data points demarcated with red arrows in the
plot of current against t-SNARE copy number (b). Data are presented as
mean ± s.e.m. c, Estimation of the t-SNARE density in the BLMs used
to form fusion pores. Typical recording showing that multiple t-SNARE 
SUVs, bearing a single gramicidin pore, fuse with the planar lipid bilayer. 
d, Histogram of the number of gramicidin pores formed, as shown in
c (n = 21 trials). e, Multiple pores sometimes form in the nanodisc-BLM assay.
Example of a recording (SUVt-SNARE (200) and ND5) in which a second
pore appeared (top). Current histograms of the recording in the upper 
panel are shown (bottom). Similar results were obtained in fifteen 
independent trials. 
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Extended Data Figure 4 | Nanodisc-BLM fusion pore properties at 
various time points. a, Typical recording of a fusion pore formed using 
ND5; this pore eventually closed after ~100 min. Similar results were 
obtained in eleven independent trials. b, Stability of fusion pores in the 
nanodisc-BLM assay. Current histograms of nanodisc-BLM assays using 
SUVt-SNARE (200) and ND0, ND3, ND5, or ND7 at different time points in
the recordings. There were no significant differences at the beginning, 
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middle, or end of a recording session, therefore fusion pores are stable. 
The baseline was also stable over the course of all recordings reported in 
this study. n = 14 for ND3; n = 20 for ND5 and ND7. For clarity, the closed 
state is shown in black and the open state is shown in red. In the case of 
ND0 and ND3, a cyan box is included to mark the appearance of open 
pores in ND3. 
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often dilate. a—c, Current histograms (left) and representative traces 


(right) of dilating fusion pores formed using ND3 (a, n=7), ND5 


(b, n=9) and ND7 (c, n= 10). d, Fraction of time for which pores are 
open. As fusion pores often dilated, we analysed the currents during 

an early phase of their initial open state (0.5 s after pore formation). 
Increasing the copy number of SNAREs per nanodisc resulted in larger 
pores!* (a-c) that spent more time in the open state (before they dilated; 
d). Data are presented as mean ± s.e.m. e, A subpopulation of fusion
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pores formed using 50-nm nanodiscs fails to dilate. Representative traces 
(left), current (middle) and open dwell time (right) histograms of non- 
dilating fusion pores (observed in 5 out of 14 trials) formed using 50-nm 
ND5. These pores exhibit well-defined open and closed states. There 

is some degree of heterogeneity regarding the v-SNARE copy number 
per nanodisc? (Fig. 3c). The non-dilating pores are likely to arise from 
nanodiscs with the lower v-SNARE densities, consistent with a model in 
which SNARE density drives dilation”. 
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Extended Data Figure 6 | Characterization of fusion pores formed by 
yeast SNAREs. a, Illustration of pores formed using the yeast SNARE 
complex comprising Ssolp, the appropriate fragment of Sec9c (residues 
401-651), and Snc2p. b, Typical recordings of fusion pores formed using 
ND0, ND3, ND5, and ND7. c, d, Open dwell-time histogram (c) and a
scatter plot of the currents (d) that result from fusion pores formed using 
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= 14). ANOVA P< 0.001; linear 


trend post hoc P< 0.001. Red arrows in d indicate the representative pores 
shown in c. Data are presented as mean ± s.e.m. There is a significant
increase in pore size and stability as the v-SNARE copy number is 
increased. The rate constants for pore closure are reported in Extended 


Data Table 2. ***P< 0.001. 
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Extended Data Table 1 | Trans-SNARE pairing underlies the formation of fusion pores 


Condition | No. of pore events | Total no. of trials | Odds ratio [95% C.I.]* | Fisher's exact test*
v-SNARE ND/t-SNARE vesicles | 12 | 51 | NA | NA
v-SNARE ND/t-SNARE vesicles & cd-SYB2 | 1 | 53 | 0.063 [0.008, 0.501] | p < 0.001
v-SNARE ND/t-SNARE vesicles & cd-t | 2 | 50 | 0.135 [0.029, 0.641] | p = 0.004
v-SNARE ND/syntaxin-alone vesicles | 0 | 51 | 0 [NA] | p < 0.001
protein free ND/t-SNARE vesicles | 0 | 50 | 0 [NA] | p < 0.001
v-SNARE ND/protein free vesicles | 0 | 53 | 0 [NA] | p < 0.001


*Compared to v-SNARE nanodisc-t-SNARE vesicles. n=4 biologically independent samples. 
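The odds ratios in this table can be checked by hand from the event and trial counts. The sketch below is illustrative only: the function name `odds_ratio_wald` is hypothetical, and the use of a Wald (log-odds) interval with z = 1.96 is an assumption, since the table does not say how the intervals were calculated. Under that assumption the numbers come out close to the tabulated values. Fisher's exact test is not reproduced here.

```python
import math


def odds_ratio_wald(events_a, trials_a, events_b, trials_b, z=1.96):
    """Odds ratio of condition A relative to reference B.

    Returns (ratio, (lower, upper)) using a Wald interval on the log odds
    ratio, or (ratio, None) when a cell of the 2x2 table is empty.
    """
    a, b = events_a, trials_a - events_a  # condition A: pores, no pores
    c, d = events_b, trials_b - events_b  # reference B: pores, no pores
    if min(a, b, c, d) == 0:
        # The log-odds interval is undefined with an empty cell (table shows NA).
        ratio = (a * d) / (b * c) if b * c else math.inf
        return ratio, None
    ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # s.e. of log(odds ratio)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, (lower, upper)


if __name__ == "__main__":
    reference = (12, 51)  # v-SNARE ND/t-SNARE vesicles
    for label, events, trials in [
        ("+ cd-SYB2", 1, 53),
        ("+ cd-t", 2, 50),
        ("syntaxin-alone vesicles", 0, 51),
    ]:
        print(label, odds_ratio_wald(events, trials, *reference))
```

The "total # of trials" column is treated as including the trials in which a pore formed, so the no-pore count is trials minus events.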
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Extended Data Table 2 | Rate constants for closure of fusion pores formed by neuronal and yeast SNAREs 


ND3 (neuronal) | 3.4 ± 0.09 | 0.3 ± 0.06
ND5 (neuronal) | 0.01 ± 0.0001 | 0.0008 ± 0.00001
ND7 (neuronal) | 0.002 ± 0.000014 | 0.00008 ± 0.00006
ND3 (yeast) | 1.4 ± 0.05 | 0.4 ± 0.03
ND5 (yeast) | 0.005 ± 0.00006 | 0.001 ± 0.00001
ND7 (yeast) | 0.002 ± 0.00005 | 0.00015 ± 0.00001

n = 14, 20 and 20 independent BLMs for ND3, ND5, and ND7 (using all neuronal SNAREs), respectively, and five different sets of nanodiscs. n = 10, 14 and 14 independent BLMs for ND3, ND5, and
ND7 (using all yeast SNAREs), respectively, and five different sets of nanodiscs. Data are presented as mean ± s.e.m.


© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


CORRECTIONS & AMENDMENTS 


CORRIGENDUM 
doi:10.1038/nature25161 


Corrigendum: Phylogenetic ctDNA 
analysis depicts early-stage lung 
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For 6 of the 96 patients included in this Article (patients CRUK0014, 
CRUK0030, CRUK0048, CRUK0059, CRUK0096 and CRUK0097) 
incorrect tumour volumetric data and positron emission tomography 
(PET) tumour background ratio (TBR) data were analysed. This 
error occurred because of the incorrect assignment of patient identi- 
fiers during the anonymization mandated by the independent review 
board of pre-operative computed tomography (CT) scans belonging 
to these patients. Data relating to this error were presented in Figs 2a 
and 3a and b, Extended Data Figs 3d and 4c-f, Extended Data Table 2b 
and Supplementary Table 1. The reanalysis of correctly anonymized 
scans does not influence the conclusions of this Article and correlation 
coefficients improve following inclusion of the corrected data. These 
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errors have been corrected online in the original Article. The authors 
apologize for any confusion these errors may have caused. 

Quartiles for the heat map in Fig. 2a have been redefined after 
including the correct data to reflect changes in quartiles for 3 (of 92) 
PET TBR values and 7 (of 95) volume parameters. The Source Data 
file supplied for Fig. 2 was not uploaded on publication; the corrected 
Source Data file for Fig. 2 is now available in the HTML version of the 
original Article. 

The plot and legend for Fig. 3a have been corrected to reflect updated 
volumetric data for the two patients affected by the correction who were 
analysed in this figure (CRUK0096 and CRUK0097). CRUK0096 was 
excluded from the updated volumetric analysis based on a criterion 
applied to our original analysis (large cavity within primary tumour). 
Consequently, the sentence in the legend to Fig. 3a “n = 38, grey vertical
lines represent range of clonal VAF, red shading indicates 95% confidence
intervals (CIs)” has been updated to read “n = 37” and in the
Methods section ‘Statistical data analysis’ the line “8 out of 46 patients 
were not included in the analysis: CRUK0036 had no preoperative CT 
scan available; CRUK0087 had a large cavity inside the primary” has 
been updated to read “9 out of 46 patients were not included in the 
analysis: CRUK0036 had no preoperative CT scan available; CRUK0087 
and CRUK0096 had a large cavity inside the primary cancer”. 

In Fig. 3b and the main text, the variant allele frequency (VAF) 
prediction values (based on tumour volume), confidence intervals 
and estimated malignant cell number contributing to a VAF of 0.1% 
have been updated. In the section ‘Determinants of ctDNA detection
in NSCLC’, confidence intervals in the sentence “a primary tumour
burden of 10 cm³ would result in a mean clonal plasma VAF of 0.1%
(95% confidence interval, 0.05–0.17%)” have been altered to read “a
primary tumour burden of 10 cm³ would result in a mean clonal VAF
of 0.1% (95% confidence interval, 0.06–0.18%)” and the sentence “a
plasma VAF of 0.1% would correspond to a primary NSCLC malignant
burden of 326 million tumour cells” has been altered to read “a plasma
VAF of 0.1% would correspond to a primary NSCLC malignant
burden of 302 million tumour cells”. In the ‘Discussion’ section the
sentence “on the basis of the relationship between tumour volume
and ctDNA plasma VAF observed in this study, a tumour volume of
0.034 cm³ would equate to a plasma VAF of 1.4 × 10⁻⁴% (95% confidence
interval, 6.4 × 10⁻⁶–0.0031%)”, has been altered to read “on the
basis of the relationship between tumour volume and ctDNA plasma
VAF observed in this study, a tumour volume of 0.034 cm³
would equate to a VAF of 1.8 × 10⁻⁴% (95% confidence interval,
9.8 × 10⁻⁶–0.0033%)”.

Further figure corrections pertaining to the six affected patients 
in Extended Data Figs 3d and 4c-f, Extended Data Table 2b and 
Supplementary Table 1 of the original Article are described and 
corrected in the Supplementary Information of this Corrigendum, 
which also shows the original, wrong Figs 2a and 3a and b. The 
Supplementary Data (containing Supplementary Table 1) of the original 
Article has been corrected. 


Supplementary Information is available in the online version of this 
Corrigendum. 
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SUSTAINABILITY 


Recycling’s 
liquid assets 


Going green doesn’t just help the planet — it also 
puts more money in your pocket for research. 


BY ELIE DOLGIN 


The freezers were stuffed and their racks
encrusted in ice, with a thin blanket
of snow covering all the sample boxes
inside. Such was the state of the cold-storage
system in Hopi Hoekstra’s laboratory a decade
after the evolutionary biologist and her team
started studying the genetics and behaviour of
deer mice there.

Kyle Turner, manager of the lab at Harvard 
University in Cambridge, Massachusetts, was 
about to spend more than US$10,000 on a new 
ultra-low-temperature (ULT) freezer. Then he
heard about a competition called the North
American Laboratory Freezer Challenge, 
which had been launched in January 2017 by 
two US non-profit organizations — My Green 
Lab, in Los Gatos, California, and the Inter- 
national Institute for Sustainable Laboratories 
(I2SL), in Annandale, Virginia. 

The challenge, which is now international 
(go.nature.com/2dyh8xi), urges labs to reduce 
energy consumption and improve equip- 
ment life through various measures. Some 
of those include defrosting freezers, to elimi- 
nate crusty ice and provide more space for 
samples, and raising the temperature set-point 
on ULT freezers from —80°C to —70°C, to cut 
electricity demands. 

The Hoekstra lab won first place in the 
individual-laboratory category for an academic 
institution. Lab members also freed so much 
space in their two existing ULT freezers that, 
despite accumulating new research materials, 
they haven't yet needed to buy a third. 

The energy savings helped to cut Harvard's 
electricity bill by around $2,500 a year, accord- 
ing to My Green Lab, and slashed annual 
greenhouse-gas emissions by the equivalent 
of 4.1 tonnes of carbon dioxide — roughly 
what would be saved by taking three cars off 
the road. It also meant that Hoekstra’s lab could 
spend the funds earmarked for a new freezer 
on other science-related expenses instead. 

Hoekstra likens it to “a free $10,000 grant” — 
and is using the money to send some trainees 
to this August’s Joint Congress on Evolutionary 
Biology in Montpellier, France. The funds will 
also help to support a high-throughput gene- 
expression analysis of brain cells from two 
related species of deer mouse. 

Campus sustainability initiatives are usually 
framed as ways for scientists to shrink their 
carbon footprints and bring down energy 
costs (see Nature 546, 565-567; 2017). But the 
Hoekstra lab’s experience shows that there are 
other reasons to pool surplus reagents, share 
equipment or keep better tabs on lab chemi- 
cals to avoid duplicate purchasing. “These 
exercises are about helping science as much 
as they are about helping the planet,” says 
Peter James, director of S-Lab, a UK initiative 
based in London that promotes sustainable lab 
practices. “They free up resources that can be 
applied for scientific purposes.” 


BOUNTY HUNTERS 

One increasingly popular way to cut lab waste 
and operational costs is through exchange 
programmes for surplus resources. At
the University of Michigan, Ann Arbor,
for example, more than 230 research and 
teaching laboratories now routinely share 
leftover chemicals, equipment and materi- 
als through a campus-wide recycling and 
reuse initiative. 

“Before this programme, these were 
thrown in the trash or disposed of as hazardous
waste for a price,” says Sudhakar Reddy,
who coordinated the university's sustainabil- 
ity efforts until his retirement last December. 
Now, he estimates, more than one-third of all 
unexpired and unused lab resources get passed 
on to other researchers, who leap on the sur- 
plus bounty — saving themselves a combined 
total of more than $250,000 a year. 

One new recruit, pulmonary-health 
researcher Benjamin Singer, freely acquired 
two high-end microscopes — valued at more 
than $6,000 apiece — which he now uses to 
study donated human-brain specimens for 
molecular signs of injury after a critical illness. 
A second researcher, cell biologist Anthony 
Vecchiarelli, saved more than $10,000 while 
kitting out his lab with free peristaltic pumps, 
circulating water baths, slide warmers and con- 
sumables. “I check the website almost weekly 
for goodies,” says Vecchiarelli. “It is a valuable
resource for a new investigator.” 

Not all academics have such a website 
at their fingertips, however. Garry Cooper 
didn’t when he was a postdoc study- 
ing neurophysiology at the Northwestern 
University Feinberg School of Medicine in 
Chicago, Illinois. And it was while he was 
helping to clean out a lab freezer one day in 
2015 that he realized there was a need for such 
a platform: he’d been handing a PhD student 
some expensive reagents, but still throwing 
away bagfuls of antibodies, a common, yet 
pricey, research tool for identifying proteins. 

He decided to create a company to reduce 
wasteful spending and promote trading 
among colleagues. He envisaged it as a kind of 
eBay, Craigslist and Ask.com rolled into one, 
providing lab scientists with a valuable service 
at a time when research funding is increasingly
hard to come by (see ‘Too much of a
good thing’). He called the start-up Rheaply, a
portmanteau of ‘research’ and ‘cheaply’.

After developing a web-based platform, 
Cooper and his company launched a pilot 
programme at Northwestern’s medical 
school last year. In its first 6 months, around 
300 researchers — close to one-third of all 
lab scientists on the medical campus — cre- 
ated Rheaply accounts. According to Cooper, 
who remains a visiting scholar at Northwest- 
ern, those users collectively posted around 
200 items, ranging from pipettes and glassware 
to chemicals and biological probes; at least 
55 items were passed on, saving labs across the 
campus more than $25,000 and keeping those 
resources out of landfills. 

Khalid Alam is one Rheaply user. Just
last month, he got hold of an $800 vacuum
pump for his postdoctoral research into RNA
engineering, although in general, he says,
“there’s not a tonne of stuff on there”. That’s one
of the main problems with any environmentally
minded programme aimed at scientists,
says Michael Blayney, executive director of the
Office for Research Safety at Northwestern.
“The challenge is: how do you encourage and
motivate people to interact with it?”


TOO MUCH OF A GOOD THING
Why lab stock lies idle

Before launching Rheaply, an online
platform where scientists can buy,
sell, trade or donate surplus labware
and supplies, Garry Cooper surveyed
120 academic researchers at
Northwestern University in Chicago,
Illinois, to learn more about why reagents
and equipment go unused, and whether
scientists would be willing to donate
surplus supplies. Most respondents said
they had extra lab provisions that they
would gladly give to colleagues. Here’s
a summary of Cooper’s findings:

Top reasons for reagents and equipment
going unused or remaining in surplus
• Initial/pilot experiments failed (71.6%)
• Initial experimental needs changed (63.6%)
• Original purchaser leaves lab (56.8%)
• Starting quantity too large (64.5%)
• Items stored in secluded areas (18.2%)
• Double ordering (15.9%)

Types of reagents and equipment that
go unused or remain in surplus
• Chemicals (80.2%)
• Antibodies/biologics (38.4%)
• Kit reagents (37.2%)
• Glassware (27.9%)
• Imaging dyes/agents (25.6%)
• Tools (16.3%)
• Tissue/cell-culture items (15.1%)
• Tubing (12.8%)
• Microscopy equipment/accessories (10.5%)
• Computer software (8.1%)


TANGIBLE BENEFITS 

Amorette Getty is involved in a number of 
waste-reduction initiatives. One is at the 
University of California, Santa Barbara, 
where she co-directs a programme called 
LabRATS (short for Laboratory Resources, 
Advocates and Teamwork for Sustainability) 
that encourages shared use of surplus chemi- 
cals and instrumentation. She says that scien- 
tists are most likely to pitch in for those efforts 
that offer them personal, tangible benefits 
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— although these needn't be directly monetary. 
“Any time I can connect the things I’m trying 
to do to increase safety and research efficiency, 
or get better storage to protect samples — that’s 
when I have my greatest successes,” she says. 

That same ethos underpins moves by three 
institutes at the University of Aberdeen, UK, 
to centrally manage ULT freezers and raise the 
operating temperature to —70°C. The initia- 
tive, says Peter McCafferey, a brain researcher 
who previously led the university’s Freezer 
Protocol Group, is as much about research 
resilience and reliable sample preservation as 
it is about energy efficiency. “We have all the 
freezers together, which makes it easier to keep 
an eye on,” he explains.

But Cooper reckoned that people would 
need more motivation before adopting such 
practices. To help Rheaply catch on, he devised 
a point-based system that rewards online 
engagement and activity. So far, Cooper has 
convinced a handful of large academic and pri- 
vate clients to sign up, and he hopes to close 
deals soon with several prominent universities 
and government research agencies, including 
the US National Institutes of Health. 

Expanding that idea of collective action 
offers additional opportunities for cutting 
costs. Most universities already have core 
facilities for specialized equipment, technolo- 
gies and services, but a few are now taking this 
centralized approach further in how they set 
up their labs. 

Take cell-culture work, for example. This 
line of research requires fairly basic equip- 
ment — laminar-flow hood, incubator, cell 
counter, microscope, centrifuge, cryostorage 
tanks — all of which is priced within the 
budget of a typical lab. According to a survey 
of biosafety officers at member institutions of 
the Association of American Universities, 86% 
of cell-culture spaces remain private, used only 
by individual labs. 

But at the University of Colorado Boulder, 
the Biochemistry Cell Culture Facility is shared 
by 70 users from 16 labs, all of whom chip in 
to pay the salary of a single facility manager. A 
case study of the collaborative research space, 
published earlier this year, compared the facil- 
ity’s approach with a hypothetical situation in 
which all the labs worked on cell culture inde- 
pendently (see go.nature.com/2fwzjhm). The 
study found that centralizing media prepara- 
tion and other tasks, instead of getting gradu- 
ate students and postdocs in each group to 
perform these jobs, saved each lab more than 
nine hours a week. 

Other savings, achieved through bulk 
purchasing and the use of recycled ethanol, 
for example, helped the biochemistry depart- 
ment and individual labs to collectively cut 
their expenses by around $195,000 per year, 
the analysis showed. Their efforts saved the 
university a further $71,000 each year by 
reducing energy bills and lowering the costs 
of ventilation and lab maintenance. “There's so
much cost avoidance,” says Kathy Ramirez-
Aguilar, programme manager of the univer- 
sity’s Green Labs Program, who conducted 
the study with her deputy, Christina Greever. 

Robert Kuchta, an enzymologist who 
uses the facility, points to a less obvious, 
environmental benefit of the sharing system. 
“It dramatically reduces liquid-nitrogen 
usage,” he says. That’s because containers 
used to store liquid nitrogen are typically 
cylindrical, and many small cylinders, of the 
type that might be used by individual labs, 
have a larger collective surface area — and 
thus a higher rate of nitrogen evaporation 
— than does a single, large cryopreserva- 
tion tank of the same volume that can store 
samples in one place. 
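Kuchta's geometric point can be made concrete with a short calculation. The sketch below is illustrative only and rests on assumptions not stated in the article: every container is a closed cylinder of the same proportions (height equal to `aspect` times the radius), and evaporation scales with total surface area. The function names are hypothetical. Under those assumptions, splitting a fixed volume across n similar cylinders multiplies the total surface area by the cube root of n.

```python
import math


def cylinder_area(volume, aspect=2.0):
    """Total surface area of a closed cylinder whose height is aspect * radius."""
    # volume = pi * r^2 * h = pi * aspect * r^3
    radius = (volume / (math.pi * aspect)) ** (1 / 3)
    height = aspect * radius
    return 2 * math.pi * radius**2 + 2 * math.pi * radius * height


def area_ratio(n_small, total_volume=1.0):
    """Surface area of n small cylinders relative to one large cylinder
    holding the same total volume."""
    small = n_small * cylinder_area(total_volume / n_small)
    large = cylinder_area(total_volume)
    return small / large


if __name__ == "__main__":
    for n in (2, 8, 16):
        print(f"{n} small tanks: {area_ratio(n):.2f} times the surface area")
```

Real cryostorage tanks differ in insulation and neck design, so the ratio is a rough guide to the scaling, not a prediction of nitrogen use.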

Even without access to a joint facility,
individual labs can still realize some of these
gains by taking advantage of laboratory-management
software. An automated inventory system
can free money that would otherwise be spent on
paying someone to keep tabs on the thousands of
reagents commonly used by large chemistry labs. And it
can save researchers from making wasteful
purchases because they can't find existing
stock on the shelves.

What’s more, just as members of the 
University of Colorado’s shared facility 
can pool their hazardous junk for disposal 
— reducing the number of times sterilized 
autoclaves are inefficiently run half-empty, 
and getting a better deal from waste-disposal 
companies — so, too, can individual labs that 
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share a common chemical-tracking system. 

“You find ways to pack the same waste 
together — and it’s quite often the same 
price, because you’re disposing of one package,”
says Marcus Phelan, a chief technical
officer and dangerous-goods safety adviser 
at Trinity College Dublin, where chemistry 
labs all use a cloud-based inventory system 
called LabCup. 


A NEW LIGHT DAWNS 

As well as benefiting from campus-wide 
initiatives, scientists can take individual 
action that will simultaneously save money, 
the environment and the integrity of their 
research. 

For example, labs with fluorescent micro- 
scopes can replace mercury lamps with 
light-emitting diodes (LEDs), which are less 
toxic and more energy-efficient. According 
to Allison Paradise, executive director of 
My Green Lab, LEDs are better for science 
because they provide a more consistent 
light source than do mercury lamps, which 
degrade over time and make it hard to quan- 
titatively compare images from different 
time points in an experiment. Buoyed by 
the success of the freezer challenge, Paradise 
says that she is in discussions with sponsors 
to set up a similar initiative, this time aimed 
at eliminating mercury from microscope 
lamps. If she’s successful, that effort will 
launch later this year. 

Ultimately, it might take a greater atten- 
tion to sustainability and efficiency across 
the entire research enterprise for the big- 
gest benefits to accrue, both financially and 
environmentally — in which case, scientists 
and funding agencies must band together to 
make that goal a priority. 

Individual labs might not have to pay the 
energy bills out of their own research grants, 
but facilities fees are part of the funding infra- 
structure, through what’s often referred to as 
‘indirect costs’. Bringing those costs down
could make more funds available for sala- 
ries, travel, equipment and other expenses 
that more directly support scientists and their 
research projects. 

So far, there’s little incentive for indi- 
vidual scientists to do their part. However, 
with many funding agencies emphasizing 
the need to justify the broader impacts of 
proposed research, Ramirez-Aguilar argues 
that implementing energy-efficient and 
environmentally sustainable lab practices 
can be a smart way for researchers to make 
their grants stand out. It might seem a small 
detail, but having such procedures in place 
could make all the difference to the success of 
your application. “If it makes your proposal 
look better,” she says, “you're more likely to 
get funding.”


Elie Dolgin is a science writer in Somerville, 
Massachusetts. 
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TRADE TALK 
River reader 


Formerly an Arctic hydrologist at the University of
Alaska Fairbanks, Jessie Cherry is now a senior hydrologist
with the US Alaska-Pacific River Forecast Center in
Anchorage, where she predicts river levels and flow. Shortlisted
twice for NASA’s astronaut programme, she
is also a commercial bush pilot with two
single-engine planes.


Why did you leave academia? 

I loved Earth science and being outside. But 
I spent most of my time finding funding for 
my research programme and staff. And as the 
chief scientist of the Geographic Information 
Network of Alaska, I had to raise another 
US$2 million a year. I was also unhappy 
with the shift towards projects with multiple 
principal investigators. 


Why did you get a pilot’s licence? 

Planes are the main form of transport in 
Alaska, so a licence is handy. From the air, 
I’ve photographed methane bubbles frozen 
in lakes, and ice build-up under bridges. 


What made you a good candidate for NASA’s 
astronaut programme? 

I applied because the independence required 
to live and work in the Arctic — like doing 
my own plumbing and electrical work — 
made me highly qualified. As a commercial 
pilot, I’m familiar with aviation and aircraft
systems, and I can make quick judgement 
calls about safety and risk. 


Describe your job. 

We forecast river levels and flows — floods 
in particular — for public safety. I compare 
measured river observations against forecast 
data, check the weather across Alaska and 
forecast how precipitation will affect river 
flows. And I get to do side projects, such as 
studies of glacial outburst floods. 


Why did you join the forecast centre? 

In academia, I was so overwhelmed with grant 
writing that I couldn't keep up with my field. 
Now I can become an expert in Arctic hydrol- 
ogy and examine the relationship between 
river flows and snowmelt, for example. Plus 
I enjoy the 40-hour working week.


INTERVIEW BY SARAH BOON 


This interview has been edited for length and clarity. 
SCIENCE FICTION


THESE 5 BOOKS GO 6 FEET DEEP 


BY TED HAYDEN 


Grave robbery. If you think it’s a relic
of gothic novels, think again.

Now that the first generation of 
body-modified tech-bros and computer- 
implanted one-percenters sleep under tomb- 
stones, there's a ton of gear in the ground. 
These books teach you everything you need 
to know to go get it — or, if rotten flesh 
makes you retch, to live vicariously through 
those who earn their living digging. 


The Modern Grave Robber’s How-To 
Guide 
by Anonymous 
You can't buy this practical page-turner on 
Amazon. Even if you find a store where it’s 
sold (pro tip: low-key ask an associate in 
Home Depot’s lawn and garden depart- 
ment), don't pay with your credit card — you 
might wind up on a government watch-list. 
Written by a grave robber with loads of 
real-world experience, chapters include 
‘How to bribe cemetery staff’ and ‘Detaching
valuable limbs’. But be careful: if your
dead grandpa got dug up on a dark night, 
you might see his decapitated head in this 
guide's useful (and graphic!) pictures. 


Second-Hand Subcutaneous Implants: 
Identification and Value Guide 

by Norm Sadowski 

Originally written for medical professionals, 
this has become a grave-robbing essential, 
the Kelley Blue Book of the cemetery set. 
Not even the freakiest of the freaky exhume 
corpses for a love of the stench — they’re in it 
for the cash, and Second-Hand Subcutaneous 
Implants breaks down the numbers. 

Although less-affluent families raid 
loved ones’ corpses before burial, and some
debt-riddled morticians steal modifications 
before sending clients to the crypts, most 
corpses are buried with at least a few micro- 
chips still implanted in their bodies. 

Dug up a geezer who died at 90? He'll prob- 
ably have had memory-enhancing neural 
prostheses implanted after a stroke or an 
Alzheimer’s diagnosis. Market price $5,000. 
Found an athlete who paraglided into a sky- 
scraper window? Check her limbs for genetic 
mod microchips. Street value is $7,500. 

Remember, though, this book wasn’t 
written with crooks in mind, so approach its 
pages using common sense. For example, a 
brain-embedded password-tracking implant 
with safe-box codes is worth way more than 


Unearthing the truth. 


spare parts if the deceased’s family hasn't 
deactivated any accounts. But if they have? 
You'll get the list price and no more. 


Coffins, Corpses and Crime: A Life 
by Wojciech Bajor 

Caught in the act of dis- 
membering a just-buried 
Silicon Valley chief 
executive, convicted 
of crimes including 
larceny and mayhem, 
then released after his 
conviction was over- 
turned on a technical- 
ity, Wojciech Bajor is a 
grave-robbing legend. 

There's the story of how 
Uber’s chief executive, more 
machine than man by the time 
he caught his final rideshare, spent 
his famously short evening underground. The 
morning after the Bay Area bigwig’s funeral, 
cemetery guards discovered an empty hole 
by his headstone. Bajor spills all the juiciest 
details, explaining how he performed this and 
other heists by training implant-sniffing dogs 
and romancing lonely-heart morticians. 

But the book’s not all big capers and 
bigger stakes — Bajor got high from his own 
supply, keeping the best implants for himself 
and paying out-of-work surgeons to insert 
them into his brain and body. He says the 
second-hand microchips betrayed him, 
sneaking their original owners’ angry spirits 
into his limbs and colluding on an undead 
plan that forced him to unconsciously make 
the mistakes that led to his arrest. 


Welcome to the Underworld: My Year 
with the Body Snatchers 

by Colton Venkatesh 

To ingratiate himself into a clique of grave 
robbers, Venkatesh, a sociology professor at 
the University of California, Riverside, took 
part in their induction ceremony, locking 
himself inside a coffin filled with rotting 
cats, human limbs and web-weaving spiders. 

After that long night, he tagged along 
as the crew broke into graveyards, visited 
black-market fairs where fences and thieves 
bargained over bioelectronic implants, and 
partied at some of the wildest bacchanals 
this side of the river Styx. 

With an eye for striking details, Venkatesh 
guides his readers through grave-robbing 
fashion (all that death plus all that manual 
labour means these guys rock a serious 
health-goth look), secret handshakes (the 
secret is, they don't do it with their own 
hands) and superstitions (Wojciech Bajor 
isn't the only one who thinks ghosts haunt 
the stolen goods — most grave robbers 
keep roosters inside their homes, 
talismans that are said to keep 
vengeful spirits at bay). 


The Digital Afterlife: How Body 
Modifications Became Conscious 
by Willa Weaver 

Don't believe in ghosts? 
Neither does Willa 
Weaver. Her pioneering 
work at the Berlin Insti- 
tute for Advanced Study 
suggests that the microchips we 
use to increase our strength, amplify 
our memory and fight diseases might also 
haunt our minds. 

In The Digital Afterlife, Weaver describes 
the case of Hanna Miller, who developed a 
neurogenic stutter after receiving a second- 
hand bioelectronic arthritis counteragent. 
Tracing the device's provenance, she discov- 
ered it had previously been implanted in a 
man who stuttered throughout his life. 

Weaver's research has uncovered hun- 
dreds of parallel cases, in which implants 
transferred cases of Tourette’s, turned tone- 
deaf amusiacs into musical prodigies, and 
gave broke welfare cases hyper-specific 
knowledge of stock-market trends. 

But what’s most startling is her final 
hypothesis. Almost all subcutaneous prod- 
ucts send a constant stream of data to manu- 
facturers, who use that information to perfect 
new products. Weaver believes that this com- 
bined knowledge has become a collective 
mind living and operating inside our bodies. 

How else to explain the fact that, one year 
ago, dozens of people using the SR-12 Hear- 
ing Implant found themselves congregating 
on the side of an empty road in the Sonoran 
Desert? Something brought them there, and 
it wasn't a friendly e-mail chain or a one-time 
travel discount from American Airlines. 

Her final warning: “To those who dig 
beneath the skin: dig carefully. Who knows 
what might try to dig its way out.” 


Ted Hayden lives in Southern California. 
His stories have been published in LOw L1f3 
#3 and Angry Old Man Issue 2. Find out 
more at tedhaydenstories.com. 